Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 16 |
| Number of observations | 48895 |
| Missing cells | 20141 |
| Missing cells (%) | 2.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 23.5 MiB |
| Average record size in memory | 503.0 B |
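The derived figures in this table can be checked from the raw counts. Below is a minimal sketch of that arithmetic, assuming the missing-cell percentage is missing cells over (variables × observations) and the total size is roughly record size × observations. The variable names are illustrative and not part of pandas-profiling.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 48895
missing_cells = 20141
avg_record_bytes = 503.0

# Total number of cells = variables x observations.
total_cells = n_variables * n_observations

# Share of missing cells, as a percentage rounded to one decimal place.
missing_pct = round(100 * missing_cells / total_cells, 1)

# Approximate total memory in MiB (1 MiB = 2**20 bytes).
total_mib = round(avg_record_bytes * n_observations / 2**20, 1)

print(total_cells, missing_pct, total_mib)
```

Both results agree with the reported 2.6% and 23.5 MiB after rounding.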
Variable types
| Type | Count |
|---|---|
| NUM | 10 |
| CAT | 6 |
Reproduction
| Field | Value |
|---|---|
| Analysis started | 2020-07-10 13:05:02.305748 |
| Analysis finished | 2020-07-10 13:06:41.940539 |
| Duration | 1 minute and 39.63 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
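The reported duration can be reproduced from the two timestamps. The sketch below uses only the Python standard library and the values from the table above. The format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-07-10 13:05:02.305748", FMT)
finished = datetime.strptime("2020-07-10 13:06:41.940539", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```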
Warnings
| Warning | Type |
|---|---|
| name has a high cardinality: 47905 distinct values | High cardinality |
| host_name has a high cardinality: 11452 distinct values | High cardinality |
| neighbourhood has a high cardinality: 221 distinct values | High cardinality |
| last_review has a high cardinality: 1764 distinct values | High cardinality |
| last_review has 10052 (20.6%) missing values | Missing |
| reviews_per_month has 10052 (20.6%) missing values | Missing |
| minimum_nights is highly skewed (γ1 = 21.82727453) | Skewed |
| name is uniformly distributed | Uniform |
| id has unique values | Unique |
| number_of_reviews has 10052 (20.6%) zeros | Zeros |
| availability_365 has 17533 (35.9%) zeros | Zeros |
id
Real number (ℝ≥0)
| Statistic | Value |
|---|---|
| Distinct count | 48895 |
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 19017143.236179568 |
| Minimum | 2539 |
| Maximum | 36487245 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 382.1 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | 2539 |
| 5-th percentile | 1222382.7 |
| Q1 | 9471945 |
| median | 19677284 |
| Q3 | 29152178.5 |
| 95-th percentile | 35259101.2 |
| Maximum | 36487245 |
| Range | 36484706 |
| Interquartile range (IQR) | 19680233.5 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 10983108.39 |
| Coefficient of variation (CV) | 0.5775372383 |
| Kurtosis | -1.227748342 |
| Mean | 19017143.24 |
| Median Absolute Deviation (MAD) | 9908242 |
| Skewness | -0.09025737546 |
| Sum | 9.298432185e+11 |
| Variance | 1.206286698e+14 |
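Several of the statistics above are simple functions of the others. As a hedged illustration (a consistency check on the reported numbers, not the profiler's own code), the range, IQR, coefficient of variation and variance can be recomputed like this:

```python
# Values copied from the quantile and descriptive statistics above.
minimum, maximum = 2539, 36487245
q1, q3 = 9471945, 29152178.5
mean, std = 19017143.24, 10983108.39

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```

Because the inputs are rounded values from the report, the recomputed CV and variance match the reported figures only to within rounding error.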
[Figure: histogram with fixed size bins (bins=10)]
Common values
| Value | Count | Frequency (%) | |
| 11667455 | 1 | < 0.1% | |
| 7851219 | 1 | < 0.1% | |
| 33138268 | 1 | < 0.1% | |
| 1624665 | 1 | < 0.1% | |
| 19387402 | 1 | < 0.1% | |
| 18516103 | 1 | < 0.1% | |
| 29802895 | 1 | < 0.1% | |
| 19983575 | 1 | < 0.1% | |
| 22078678 | 1 | < 0.1% | |
| 33684693 | 1 | < 0.1% | |
| 34418130 | 1 | < 0.1% | |
| 24827469 | 1 | < 0.1% | |
| 35653330 | 1 | < 0.1% | |
| 34065052 | 1 | < 0.1% | |
| 11222221 | 1 | < 0.1% | |
| 15542745 | 1 | < 0.1% | |
| 27417822 | 1 | < 0.1% | |
| 20055242 | 1 | < 0.1% | |
| 11801800 | 1 | < 0.1% | |
| 27653212 | 1 | < 0.1% | |
| 34157056 | 1 | < 0.1% | |
| 3098703 | 1 | < 0.1% | |
| 4629726 | 1 | < 0.1% | |
| 10966240 | 1 | < 0.1% | |
| 23086483 | 1 | < 0.1% | |
| Other values (48870) | 48870 | 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 2539 | 1 | < 0.1% | |
| 2595 | 1 | < 0.1% | |
| 3647 | 1 | < 0.1% | |
| 3831 | 1 | < 0.1% | |
| 5022 | 1 | < 0.1% | |
| 5099 | 1 | < 0.1% | |
| 5121 | 1 | < 0.1% | |
| 5178 | 1 | < 0.1% | |
| 5203 | 1 | < 0.1% | |
| 5238 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 36487245 | 1 | < 0.1% | |
| 36485609 | 1 | < 0.1% | |
| 36485431 | 1 | < 0.1% | |
| 36485057 | 1 | < 0.1% | |
| 36484665 | 1 | < 0.1% | |
| 36484363 | 1 | < 0.1% | |
| 36484087 | 1 | < 0.1% | |
| 36483152 | 1 | < 0.1% | |
| 36483010 | 1 | < 0.1% | |
| 36482809 | 1 | < 0.1% |
name
Categorical
| Statistic | Value |
|---|---|
| Distinct count | 47905 |
| Unique (%) | 98.0% |
| Missing | 16 |
| Missing (%) | < 0.1% |
| Memory size | 382.1 KiB |
| Value | Count | Frequency (%) | |
| Hillside Hotel | 18 | < 0.1% | |
| Home away from home | 17 | < 0.1% | |
| New york Multi-unit building | 16 | < 0.1% | |
| Brooklyn Apartment | 12 | < 0.1% | |
| Private Room | 11 | < 0.1% | |
| Loft Suite @ The Box House Hotel | 11 | < 0.1% | |
| Artsy Private BR in Fort Greene Cumberland | 10 | < 0.1% | |
| Private room | 10 | < 0.1% | |
| Cozy Brooklyn Apartment | 8 | < 0.1% | |
| Private room in Brooklyn | 8 | < 0.1% | |
| Private room in Williamsburg | 8 | < 0.1% | |
| Beautiful Brooklyn Brownstone | 8 | < 0.1% | |
| Harlem Gem | 7 | < 0.1% | |
| Cozy East Village Apartment | 6 | < 0.1% | |
| New York Apartment | 6 | < 0.1% | |
| Home Away From Home | 6 | < 0.1% | |
| Private room in Manhattan | 6 | < 0.1% | |
| Bushwick Oasis | 6 | < 0.1% | |
| Cozy Room | 6 | < 0.1% | |
| IN MINT CONDITION-STUDIOS EAST 44TH/UNITED NATIONS | 6 | < 0.1% | |
| West Village Apartment | 6 | < 0.1% | |
| Home Sweet Home | 6 | < 0.1% | |
| Cozy Private Room | 5 | < 0.1% | |
| A CLASSIC NYC NEIGHBORHOOD-EAST 86TH/5TH AVENUE | 5 | < 0.1% | |
| Private Room in Williamsburg | 5 | < 0.1% | |
| Other values (47880) | 48666 | 99.5% | |
| (Missing) | 16 | < 0.1% |
Length
| Statistic | Value |
|---|---|
| Max length | 179 |
| Median length | 36 |
| Mean length | 36.90005113 |
| Min length | 1 |
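The length statistics summarise the number of characters in each value. Below is a minimal sketch of how such a summary could be computed with the standard library. `length_stats` is a hypothetical helper, and skipping missing values (`None`) is an assumption rather than documented profiler behaviour.

```python
from statistics import mean, median

def length_stats(values):
    """Return max, median, mean and min string length, skipping None entries."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# A few listing names taken from the value table above, plus one missing entry.
sample = ["Hillside Hotel", "Private Room", "Harlem Gem", None]
stats = length_stats(sample)
print(stats)
```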
Most occurring characters
| Value | Count | Frequency (%) | |
| (space) | 251424 | 13.9% | |
| e | 124635 | 6.9% | |
| o | 122324 | 6.8% | |
| t | 105261 | 5.8% | |
| a | 103602 | 5.7% | |
| r | 97946 | 5.4% | |
| i | 94651 | 5.2% | |
| n | 94643 | 5.2% | |
| l | 51723 | 2.9% | |
| m | 49121 | 2.7% | |
| s | 48092 | 2.7% | |
| u | 46324 | 2.6% | |
| d | 38109 | 2.1% | |
| h | 31170 | 1.7% | |
| B | 29965 | 1.7% | |
| p | 29765 | 1.6% | |
| y | 28894 | 1.6% | |
| S | 26481 | 1.5% | |
| c | 23841 | 1.3% | |
| g | 21698 | 1.2% | |
| C | 20989 | 1.2% | |
| A | 19424 | 1.1% | |
| w | 19006 | 1.1% | |
| R | 17945 | 1.0% | |
| b | 17763 | 1.0% | |
| Other values (751) | 289432 | 16.0% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 1206256 | 66.9% | |
| Uppercase Letter | 270574 | 15.0% | |
| Space Separator | 251428 | 13.9% | |
| Other Punctuation | 33826 | 1.9% | |
| Decimal Number | 25321 | 1.4% | |
| Dash Punctuation | 6878 | 0.4% | |
| Math Symbol | 2738 | 0.2% | |
| Other Letter | 2547 | 0.1% | |
| Close Punctuation | 1537 | 0.1% | |
| Open Punctuation | 1395 | 0.1% | |
| Other Symbol | 879 | < 0.1% | |
| Final Punctuation | 238 | < 0.1% | |
| Control | 185 | < 0.1% | |
| Nonspacing Mark | 179 | < 0.1% | |
| Currency Symbol | 94 | < 0.1% | |
| Initial Punctuation | 48 | < 0.1% | |
| Connector Punctuation | 43 | < 0.1% | |
| Modifier Letter | 37 | < 0.1% | |
| Modifier Symbol | 16 | < 0.1% | |
| Other Number | 9 | < 0.1% |
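The category labels in this table correspond to Unicode general categories. As a rough sketch of how such counts could be produced (not necessarily how the profiler does it), Python's `unicodedata.category` returns the two-letter code for each character. The `CATEGORY_NAMES` mapping and `category_counts` helper are illustrative names and cover only some of the categories.

```python
import unicodedata
from collections import Counter

# Long names for some two-letter general-category codes, matching the labels
# in the table above; codes not listed here are kept as-is.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
    "Lo": "Other Letter",
    "So": "Other Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["Cozy Room", "Loft + 2BR", "房 ★"]
counts = category_counts(sample)
print(counts.most_common())
```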
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| B | 29965 | 11.1% | |
| S | 26481 | 9.8% | |
| C | 20989 | 7.8% | |
| A | 19424 | 7.2% | |
| R | 17945 | 6.6% | |
| P | 14623 | 5.4% | |
| E | 14350 | 5.3% | |
| L | 14062 | 5.2% | |
| M | 11930 | 4.4% | |
| N | 11701 | 4.3% | |
| T | 11570 | 4.3% | |
| H | 10891 | 4.0% | |
| O | 8934 | 3.3% | |
| W | 8136 | 3.0% | |
| G | 7620 | 2.8% | |
| I | 6763 | 2.5% | |
| U | 6128 | 2.3% | |
| F | 5934 | 2.2% | |
| D | 5827 | 2.2% | |
| Y | 5640 | 2.1% | |
| V | 4517 | 1.7% | |
| K | 2607 | 1.0% | |
| Q | 2246 | 0.8% | |
| J | 1240 | 0.5% | |
| Z | 550 | 0.2% | |
| Other values (18) | 501 | 0.2% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| e | 124635 | 10.3% | |
| o | 122324 | 10.1% | |
| t | 105261 | 8.7% | |
| a | 103602 | 8.6% | |
| r | 97946 | 8.1% | |
| i | 94651 | 7.8% | |
| n | 94643 | 7.8% | |
| l | 51723 | 4.3% | |
| m | 49121 | 4.1% | |
| s | 48092 | 4.0% | |
| u | 46324 | 3.8% | |
| d | 38109 | 3.2% | |
| h | 31170 | 2.6% | |
| p | 29765 | 2.5% | |
| y | 28894 | 2.4% | |
| c | 23841 | 2.0% | |
| g | 21698 | 1.8% | |
| w | 19006 | 1.6% | |
| b | 17763 | 1.5% | |
| f | 17153 | 1.4% | |
| v | 13552 | 1.1% | |
| k | 13472 | 1.1% | |
| z | 6162 | 0.5% | |
| x | 4484 | 0.4% | |
| q | 2362 | 0.2% | |
| Other values (43) | 503 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 251424 | > 99.9% | |
| (other space separator) | 4 | < 0.1% | |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| , | 9177 | 27.1% | |
| ! | 7855 | 23.2% | |
| / | 5230 | 15.5% | |
| . | 4375 | 12.9% | |
| & | 3182 | 9.4% | |
| ' | 1074 | 3.2% | |
| * | 1021 | 3.0% | |
| : | 597 | 1.8% | |
| # | 555 | 1.6% | |
| " | 294 | 0.9% | |
| @ | 189 | 0.6% | |
| ; | 121 | 0.4% | |
| • | 62 | 0.2% | |
| % | 29 | 0.1% | |
| ? | 21 | 0.1% | |
| 。 | 15 | < 0.1% | |
| · | 13 | < 0.1% | |
| 、 | 6 | < 0.1% | |
| \ | 6 | < 0.1% | |
| ¡ | 3 | < 0.1% | |
| ・ | 1 | < 0.1% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 8661 | 34.2% | |
| 2 | 6830 | 27.0% | |
| 3 | 2560 | 10.1% | |
| 5 | 2164 | 8.5% | |
| 0 | 2115 | 8.4% | |
| 4 | 1307 | 5.2% | |
| 6 | 569 | 2.2% | |
| 7 | 450 | 1.8% | |
| 8 | 399 | 1.6% | |
| 9 | 266 | 1.1% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 6804 | 98.9% | |
| — | 47 | 0.7% | |
| – | 26 | 0.4% | |
| ― | 1 | < 0.1% |
Most frequent Math Symbol characters
| Value | Count | Frequency (%) | |
| + | 1382 | 50.5% | |
| | | 992 | 36.2% | |
| ~ | 271 | 9.9% | |
| = | 34 | 1.2% | |
| > | 25 | 0.9% | |
| < | 20 | 0.7% | |
| → | 6 | 0.2% | |
| ⋆ | 4 | 0.1% | |
| √ | 2 | 0.1% | |
| × | 1 | < 0.1% | |
| ⊹ | 1 | < 0.1% |
Most frequent Currency Symbol characters
| Value | Count | Frequency (%) | |
| $ | 94 | 100.0% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 1339 | 96.0% | |
| [ | 36 | 2.6% | |
| { | 9 | 0.6% | |
| 【 | 8 | 0.6% | |
| 《 | 3 | 0.2% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| ) | 1480 | 96.3% | |
| ] | 37 | 2.4% | |
| } | 9 | 0.6% | |
| 】 | 8 | 0.5% | |
| 》 | 3 | 0.2% |
Most frequent Final Punctuation characters
| Value | Count | Frequency (%) | |
| ’ | 200 | 84.0% | |
| ” | 38 | 16.0% |
Most frequent Other Symbol characters
| Value | Count | Frequency (%) | |
| ★ | 266 | 30.3% | |
| ❤ | 168 | 19.1% | |
| ☆ | 105 | 11.9% | |
| ♥ | 38 | 4.3% | |
| ⭐ | 35 | 4.0% | |
| ✨ | 34 | 3.9% | |
| ❥ | 25 | 2.8% | |
| ✿ | 15 | 1.7% | |
| ☀ | 15 | 1.7% | |
| ✰ | 14 | 1.6% | |
| ✴ | 11 | 1.3% | |
| ♀ | 11 | 1.3% | |
| ⚡ | 8 | 0.9% | |
| ✪ | 8 | 0.9% | |
| ♡ | 6 | 0.7% | |
| ✌ | 6 | 0.7% | |
| ♛ | 6 | 0.7% | |
| ♦ | 6 | 0.7% | |
| ⚓ | 6 | 0.7% | |
| ➡ | 5 | 0.6% | |
| ✦ | 4 | 0.5% | |
| ⚜ | 4 | 0.5% | |
| ✺ | 4 | 0.5% | |
| ♕ | 4 | 0.5% | |
| ▲ | 4 | 0.5% | |
| Other values (35) | 71 | 8.1% |
Most frequent Nonspacing Mark characters
| Value | Count | Frequency (%) | |
| ️ | 165 | 92.2% | |
| ︎ | 14 | 7.8% |
Most frequent Connector Punctuation characters
| Value | Count | Frequency (%) | |
| _ | 42 | 97.7% | |
| ‿ | 1 | 2.3% |
Most frequent Other Number characters
| Value | Count | Frequency (%) | |
| ² | 9 | 100.0% |
Most frequent Control characters
| Value | Count | Frequency (%) | |
| (control character) | 185 | 100.0% | |
Most frequent Modifier Symbol characters
| Value | Count | Frequency (%) | |
| ^ | 9 | 56.2% | |
| ` | 4 | 25.0% | |
| ´ | 3 | 18.8% |
Most frequent Initial Punctuation characters
| Value | Count | Frequency (%) | |
| “ | 40 | 83.3% | |
| ‘ | 8 | 16.7% |
Most frequent Other Letter characters
| Value | Count | Frequency (%) | |
| 房 | 82 | 3.2% | |
| 家 | 46 | 1.8% | |
| 中 | 44 | 1.7% | |
| 间 | 41 | 1.6% | |
| 的 | 38 | 1.5% | |
| 拉 | 37 | 1.5% | |
| 法 | 36 | 1.4% | |
| 盛 | 36 | 1.4% | |
| 大 | 30 | 1.2% | |
| 约 | 29 | 1.1% | |
| 纽 | 28 | 1.1% | |
| 人 | 28 | 1.1% | |
| 地 | 27 | 1.1% | |
| 分 | 26 | 1.0% | |
| 公 | 25 | 1.0% | |
| 心 | 25 | 1.0% | |
| 近 | 25 | 1.0% | |
| 温 | 23 | 0.9% | |
| 馨 | 21 | 0.8% | |
| 单 | 21 | 0.8% | |
| 曼 | 20 | 0.8% | |
| 立 | 20 | 0.8% | |
| 便 | 20 | 0.8% | |
| 旅 | 20 | 0.8% | |
| 寓 | 19 | 0.7% | |
| Other values (505) | 1780 | 69.9% |
Most frequent Modifier Letter characters
| Value | Count | Frequency (%) | |
| ゙ | 21 | 56.8% | |
| ー | 11 | 29.7% | |
| ゚ | 5 | 13.5% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 1476627 | 81.8% | |
| Common | 324672 | 18.0% | |
| Han | 2237 | 0.1% | |
| Cyrillic | 191 | < 0.1% | |
| Inherited | 179 | < 0.1% | |
| Katakana | 136 | < 0.1% | |
| Hiragana | 70 | < 0.1% | |
| Hangul | 70 | < 0.1% | |
| Hebrew | 31 | < 0.1% | |
| Georgian | 13 | < 0.1% | |
| Devanagari | 2 | < 0.1% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| e | 124635 | 8.4% | |
| o | 122324 | 8.3% | |
| t | 105261 | 7.1% | |
| a | 103602 | 7.0% | |
| r | 97946 | 6.6% | |
| i | 94651 | 6.4% | |
| n | 94643 | 6.4% | |
| l | 51723 | 3.5% | |
| m | 49121 | 3.3% | |
| s | 48092 | 3.3% | |
| u | 46324 | 3.1% | |
| d | 38109 | 2.6% | |
| h | 31170 | 2.1% | |
| B | 29965 | 2.0% | |
| p | 29765 | 2.0% | |
| y | 28894 | 2.0% | |
| S | 26481 | 1.8% | |
| c | 23841 | 1.6% | |
| g | 21698 | 1.5% | |
| C | 20989 | 1.4% | |
| A | 19424 | 1.3% | |
| w | 19006 | 1.3% | |
| R | 17945 | 1.2% | |
| b | 17763 | 1.2% | |
| f | 17153 | 1.2% | |
| Other values (53) | 196102 | 13.3% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 251424 | 77.4% | |
| , | 9177 | 2.8% | |
| 1 | 8661 | 2.7% | |
| ! | 7855 | 2.4% | |
| 2 | 6830 | 2.1% | |
| - | 6804 | 2.1% | |
| / | 5230 | 1.6% | |
| . | 4375 | 1.3% | |
| & | 3182 | 1.0% | |
| 3 | 2560 | 0.8% | |
| 5 | 2164 | 0.7% | |
| 0 | 2115 | 0.7% | |
| ) | 1480 | 0.5% | |
| + | 1382 | 0.4% | |
| ( | 1339 | 0.4% | |
| 4 | 1307 | 0.4% | |
| ' | 1074 | 0.3% | |
| * | 1021 | 0.3% | |
| | | 992 | 0.3% | |
| : | 597 | 0.2% | |
| 6 | 569 | 0.2% | |
| # | 555 | 0.2% | |
| 7 | 450 | 0.1% | |
| 8 | 399 | 0.1% | |
| " | 294 | 0.1% | |
| Other values (108) | 2836 | 0.9% |
Most frequent Inherited characters
| Value | Count | Frequency (%) | |
| ️ | 165 | 92.2% | |
| ︎ | 14 | 7.8% |
Most frequent Han characters
| Value | Count | Frequency (%) | |
| 房 | 82 | 3.7% | |
| 家 | 46 | 2.1% | |
| 中 | 44 | 2.0% | |
| 间 | 41 | 1.8% | |
| 的 | 38 | 1.7% | |
| 拉 | 37 | 1.7% | |
| 法 | 36 | 1.6% | |
| 盛 | 36 | 1.6% | |
| 大 | 30 | 1.3% | |
| 约 | 29 | 1.3% | |
| 纽 | 28 | 1.3% | |
| 人 | 28 | 1.3% | |
| 地 | 27 | 1.2% | |
| 分 | 26 | 1.2% | |
| 公 | 25 | 1.1% | |
| 心 | 25 | 1.1% | |
| 近 | 25 | 1.1% | |
| 温 | 23 | 1.0% | |
| 馨 | 21 | 0.9% | |
| 单 | 21 | 0.9% | |
| 曼 | 20 | 0.9% | |
| 立 | 20 | 0.9% | |
| 便 | 20 | 0.9% | |
| 旅 | 20 | 0.9% | |
| 寓 | 19 | 0.8% | |
| Other values (386) | 1470 | 65.7% |
Most frequent Katakana characters
| Value | Count | Frequency (%) | |
| ン | 14 | 10.3% | |
| ク | 12 | 8.8% | |
| リ | 10 | 7.4% | |
| ハ | 9 | 6.6% | |
| ッ | 9 | 6.6% | |
| ア | 9 | 6.6% | |
| ス | 8 | 5.9% | |
| ト | 7 | 5.1% | |
| フ | 6 | 4.4% | |
| ウ | 6 | 4.4% | |
| ル | 5 | 3.7% | |
| タ | 4 | 2.9% | |
| ム | 4 | 2.9% | |
| イ | 3 | 2.2% | |
| レ | 3 | 2.2% | |
| エ | 3 | 2.2% | |
| ィ | 3 | 2.2% | |
| キ | 2 | 1.5% | |
| ロ | 2 | 1.5% | |
| ミ | 2 | 1.5% | |
| マ | 2 | 1.5% | |
| シ | 2 | 1.5% | |
| ェ | 2 | 1.5% | |
| ニ | 1 | 0.7% | |
| ュ | 1 | 0.7% | |
| Other values (7) | 7 | 5.1% |
Most frequent Hiragana characters
| Value | Count | Frequency (%) | |
| の | 16 | 22.9% | |
| で | 7 | 10.0% | |
| か | 7 | 10.0% | |
| ら | 6 | 8.6% | |
| お | 5 | 7.1% | |
| い | 4 | 5.7% | |
| な | 4 | 5.7% | |
| に | 3 | 4.3% | |
| き | 2 | 2.9% | |
| く | 2 | 2.9% | |
| す | 2 | 2.9% | |
| し | 1 | 1.4% | |
| て | 1 | 1.4% | |
| み | 1 | 1.4% | |
| ま | 1 | 1.4% | |
| せ | 1 | 1.4% | |
| ん | 1 | 1.4% | |
| る | 1 | 1.4% | |
| が | 1 | 1.4% | |
| わ | 1 | 1.4% | |
| ど | 1 | 1.4% | |
| こ | 1 | 1.4% | |
| も | 1 | 1.4% |
Most frequent Cyrillic characters
| Value | Count | Frequency (%) | |
| а | 26 | 13.6% | |
| о | 18 | 9.4% | |
| т | 17 | 8.9% | |
| н | 15 | 7.9% | |
| е | 13 | 6.8% | |
| к | 11 | 5.8% | |
| р | 11 | 5.8% | |
| м | 10 | 5.2% | |
| с | 9 | 4.7% | |
| в | 9 | 4.7% | |
| я | 7 | 3.7% | |
| л | 6 | 3.1% | |
| и | 5 | 2.6% | |
| К | 4 | 2.1% | |
| д | 4 | 2.1% | |
| у | 3 | 1.6% | |
| М | 3 | 1.6% | |
| б | 2 | 1.0% | |
| й | 2 | 1.0% | |
| Н | 2 | 1.0% | |
| ё | 2 | 1.0% | |
| х | 1 | 0.5% | |
| г | 1 | 0.5% | |
| ь | 1 | 0.5% | |
| ю | 1 | 0.5% | |
| Other values (8) | 8 | 4.2% |
Most frequent Hebrew characters
| Value | Count | Frequency (%) | |
| י | 5 | 16.1% | |
| ו | 5 | 16.1% | |
| ב | 4 | 12.9% | |
| ר | 4 | 12.9% | |
| ע | 2 | 6.5% | |
| ת | 2 | 6.5% | |
| ה | 2 | 6.5% | |
| ד | 1 | 3.2% | |
| ש | 1 | 3.2% | |
| ל | 1 | 3.2% | |
| א | 1 | 3.2% | |
| מ | 1 | 3.2% | |
| ס | 1 | 3.2% | |
| ג | 1 | 3.2% |
Most frequent Georgian characters
| Value | Count | Frequency (%) | |
| ღ | 13 | 100.0% |
Most frequent Hangul characters
| Value | Count | Frequency (%) | |
| 한 | 7 | 10.0% | |
| 웃 | 3 | 4.3% | |
| 성 | 3 | 4.3% | |
| 스 | 2 | 2.9% | |
| 리 | 2 | 2.9% | |
| 고 | 2 | 2.9% | |
| 맨 | 2 | 2.9% | |
| 하 | 2 | 2.9% | |
| 소 | 2 | 2.9% | |
| 따 | 2 | 2.9% | |
| 뜻 | 2 | 2.9% | |
| 작 | 2 | 2.9% | |
| 은 | 2 | 2.9% | |
| 건 | 2 | 2.9% | |
| 물 | 2 | 2.9% | |
| 유 | 1 | 1.4% | |
| 웨 | 1 | 1.4% | |
| 트 | 1 | 1.4% | |
| 빌 | 1 | 1.4% | |
| 지 | 1 | 1.4% | |
| 에 | 1 | 1.4% | |
| 위 | 1 | 1.4% | |
| 치 | 1 | 1.4% | |
| 급 | 1 | 1.4% | |
| 튜 | 1 | 1.4% | |
| Other values (23) | 23 | 32.9% |
Most frequent Devanagari characters
| Value | Count | Frequency (%) | |
| ॐ | 2 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 1799735 | 99.8% | |
| CJK | 2237 | 0.1% | |
| Misc Symbols | 500 | < 0.1% | |
| None | 431 | < 0.1% | |
| Punctuation | 423 | < 0.1% | |
| Dingbats | 320 | < 0.1% | |
| Cyrillic | 191 | < 0.1% | |
| VS | 179 | < 0.1% | |
| Hiragana | 70 | < 0.1% | |
| Hangul | 70 | < 0.1% | |
| Hebrew | 31 | < 0.1% | |
| Georgian | 13 | < 0.1% | |
| Geometric Shapes | 11 | < 0.1% | |
| Math Operators | 7 | < 0.1% | |
| Misc Technical | 7 | < 0.1% | |
| Devanagari | 2 | < 0.1% | |
| Letterlike Symbols | 1 | < 0.1% |
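A block is a contiguous range of code points, so a character's block can be looked up by comparing its code point against known ranges. The sketch below uses a small hand-written subset of ranges with the short labels from this table. `BLOCKS` and `block_of` are hypothetical names, and returning the string "None" for unlisted code points only mirrors the "None" row above as an assumption.

```python
# Hand-written subset of Unicode block ranges (inclusive), labelled with the
# short names used in the table above.
BLOCKS = [
    (0x0000, 0x007F, "ASCII"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x2600, 0x26FF, "Misc Symbols"),
    (0x2700, 0x27BF, "Dingbats"),
    (0x3040, 0x309F, "Hiragana"),
    (0x4E00, 0x9FFF, "CJK"),
]

def block_of(ch):
    """Return the block label for a single character, or "None" if unlisted."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "None"

for ch in ["e", "房", "★", "❤", "の", "é"]:
    print(repr(ch), block_of(ch))
```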
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| (space) | 251424 | 14.0% | |
| e | 124635 | 6.9% | |
| o | 122324 | 6.8% | |
| t | 105261 | 5.8% | |
| a | 103602 | 5.8% | |
| r | 97946 | 5.4% | |
| i | 94651 | 5.3% | |
| n | 94643 | 5.3% | |
| l | 51723 | 2.9% | |
| m | 49121 | 2.7% | |
| s | 48092 | 2.7% | |
| u | 46324 | 2.6% | |
| d | 38109 | 2.1% | |
| h | 31170 | 1.7% | |
| B | 29965 | 1.7% | |
| p | 29765 | 1.7% | |
| y | 28894 | 1.6% | |
| S | 26481 | 1.5% | |
| c | 23841 | 1.3% | |
| g | 21698 | 1.2% | |
| C | 20989 | 1.2% | |
| A | 19424 | 1.1% | |
| w | 19006 | 1.1% | |
| R | 17945 | 1.0% | |
| b | 17763 | 1.0% | |
| Other values (71) | 284939 | 15.8% |
Most frequent Punctuation characters
| Value | Count | Frequency (%) | |
| ’ | 200 | 47.3% | |
| • | 62 | 14.7% | |
| — | 47 | 11.1% | |
| “ | 40 | 9.5% | |
| ” | 38 | 9.0% | |
| – | 26 | 6.1% | |
| ‘ | 8 | 1.9% | |
| ― | 1 | 0.2% | |
| ‿ | 1 | 0.2% |
Most frequent None characters
| Value | Count | Frequency (%) | |
| ⭐ | 35 | 8.1% | |
| à | 28 | 6.5% | |
| ó | 24 | 5.6% | |
| ゙ | 21 | 4.9% | |
| é | 16 | 3.7% | |
| 。 | 15 | 3.5% | |
| ン | 14 | 3.2% | |
| · | 13 | 3.0% | |
| ク | 12 | 2.8% | |
| ー | 11 | 2.6% | |
| リ | 10 | 2.3% | |
| ² | 9 | 2.1% | |
| ハ | 9 | 2.1% | |
| ッ | 9 | 2.1% | |
| ア | 9 | 2.1% | |
| Ô | 9 | 2.1% | |
| 【 | 8 | 1.9% | |
| 】 | 8 | 1.9% | |
| ス | 8 | 1.9% | |
| ä | 7 | 1.6% | |
| ト | 7 | 1.6% | |
| í | 6 | 1.4% | |
| 、 | 6 | 1.4% | |
| フ | 6 | 1.4% | |
| ウ | 6 | 1.4% | |
| Other values (55) | 125 | 29.0% |
Most frequent VS characters
| Value | Count | Frequency (%) | |
| ️ | 165 | 92.2% | |
| ︎ | 14 | 7.8% |
Most frequent Misc Symbols characters
| Value | Count | Frequency (%) | |
| ★ | 266 | 53.2% | |
| ☆ | 105 | 21.0% | |
| ♥ | 38 | 7.6% | |
| ☀ | 15 | 3.0% | |
| ♀ | 11 | 2.2% | |
| ⚡ | 8 | 1.6% | |
| ♡ | 6 | 1.2% | |
| ♛ | 6 | 1.2% | |
| ♦ | 6 | 1.2% | |
| ⚓ | 6 | 1.2% | |
| ⚜ | 4 | 0.8% | |
| ♕ | 4 | 0.8% | |
| ♪ | 4 | 0.8% | |
| ☼ | 3 | 0.6% | |
| ☺ | 3 | 0.6% | |
| ♔ | 3 | 0.6% | |
| ♂ | 3 | 0.6% | |
| ☯ | 3 | 0.6% | |
| ☝ | 2 | 0.4% | |
| ☕ | 2 | 0.4% | |
| ☞ | 1 | 0.2% | |
| ⚾ | 1 | 0.2% |
Most frequent Dingbats characters
| Value | Count | Frequency (%) | |
| ❤ | 168 | 52.5% | |
| ✨ | 34 | 10.6% | |
| ❥ | 25 | 7.8% | |
| ✿ | 15 | 4.7% | |
| ✰ | 14 | 4.4% | |
| ✴ | 11 | 3.4% | |
| ✪ | 8 | 2.5% | |
| ✌ | 6 | 1.9% | |
| ➡ | 5 | 1.6% | |
| ✦ | 4 | 1.2% | |
| ✺ | 4 | 1.2% | |
| ✮ | 3 | 0.9% | |
| ✔ | 3 | 0.9% | |
| ✈ | 3 | 0.9% | |
| ✤ | 3 | 0.9% | |
| ✩ | 2 | 0.6% | |
| ❋ | 2 | 0.6% | |
| ✭ | 2 | 0.6% | |
| ❣ | 2 | 0.6% | |
| ❀ | 2 | 0.6% | |
| ➖ | 2 | 0.6% | |
| ➽ | 1 | 0.3% | |
| ✧ | 1 | 0.3% |
Most frequent Letterlike Symbols characters
| Value | Count | Frequency (%) | |
| ™ | 1 | 100.0% |
Most frequent CJK characters
| Value | Count | Frequency (%) | |
| 房 | 82 | 3.7% | |
| 家 | 46 | 2.1% | |
| 中 | 44 | 2.0% | |
| 间 | 41 | 1.8% | |
| 的 | 38 | 1.7% | |
| 拉 | 37 | 1.7% | |
| 法 | 36 | 1.6% | |
| 盛 | 36 | 1.6% | |
| 大 | 30 | 1.3% | |
| 约 | 29 | 1.3% | |
| 纽 | 28 | 1.3% | |
| 人 | 28 | 1.3% | |
| 地 | 27 | 1.2% | |
| 分 | 26 | 1.2% | |
| 公 | 25 | 1.1% | |
| 心 | 25 | 1.1% | |
| 近 | 25 | 1.1% | |
| 温 | 23 | 1.0% | |
| 馨 | 21 | 0.9% | |
| 单 | 21 | 0.9% | |
| 曼 | 20 | 0.9% | |
| 立 | 20 | 0.9% | |
| 便 | 20 | 0.9% | |
| 旅 | 20 | 0.9% | |
| 寓 | 19 | 0.8% | |
| Other values (386) | 1470 | 65.7% |
Most frequent Hiragana characters
| Value | Count | Frequency (%) | |
| の | 16 | 22.9% | |
| で | 7 | 10.0% | |
| か | 7 | 10.0% | |
| ら | 6 | 8.6% | |
| お | 5 | 7.1% | |
| い | 4 | 5.7% | |
| な | 4 | 5.7% | |
| に | 3 | 4.3% | |
| き | 2 | 2.9% | |
| く | 2 | 2.9% | |
| す | 2 | 2.9% | |
| し | 1 | 1.4% | |
| て | 1 | 1.4% | |
| み | 1 | 1.4% | |
| ま | 1 | 1.4% | |
| せ | 1 | 1.4% | |
| ん | 1 | 1.4% | |
| る | 1 | 1.4% | |
| が | 1 | 1.4% | |
| わ | 1 | 1.4% | |
| ど | 1 | 1.4% | |
| こ | 1 | 1.4% | |
| も | 1 | 1.4% |
Most frequent Cyrillic characters
| Value | Count | Frequency (%) | |
| а | 26 | 13.6% | |
| о | 18 | 9.4% | |
| т | 17 | 8.9% | |
| н | 15 | 7.9% | |
| е | 13 | 6.8% | |
| к | 11 | 5.8% | |
| р | 11 | 5.8% | |
| м | 10 | 5.2% | |
| с | 9 | 4.7% | |
| в | 9 | 4.7% | |
| я | 7 | 3.7% | |
| л | 6 | 3.1% | |
| и | 5 | 2.6% | |
| К | 4 | 2.1% | |
| д | 4 | 2.1% | |
| у | 3 | 1.6% | |
| М | 3 | 1.6% | |
| б | 2 | 1.0% | |
| й | 2 | 1.0% | |
| Н | 2 | 1.0% | |
| ё | 2 | 1.0% | |
| х | 1 | 0.5% | |
| г | 1 | 0.5% | |
| ь | 1 | 0.5% | |
| ю | 1 | 0.5% | |
| Other values (8) | 8 | 4.2% |
Most frequent Hebrew characters
| Value | Count | Frequency (%) | |
| י | 5 | 16.1% | |
| ו | 5 | 16.1% | |
| ב | 4 | 12.9% | |
| ר | 4 | 12.9% | |
| ע | 2 | 6.5% | |
| ת | 2 | 6.5% | |
| ה | 2 | 6.5% | |
| ד | 1 | 3.2% | |
| ש | 1 | 3.2% | |
| ל | 1 | 3.2% | |
| א | 1 | 3.2% | |
| מ | 1 | 3.2% | |
| ס | 1 | 3.2% | |
| ג | 1 | 3.2% |
Most frequent Georgian characters
| Value | Count | Frequency (%) | |
| ღ | 13 | 100.0% |
Most frequent Hangul characters
| Value | Count | Frequency (%) | |
| 한 | 7 | 10.0% | |
| 웃 | 3 | 4.3% | |
| 성 | 3 | 4.3% | |
| 스 | 2 | 2.9% | |
| 리 | 2 | 2.9% | |
| 고 | 2 | 2.9% | |
| 맨 | 2 | 2.9% | |
| 하 | 2 | 2.9% | |
| 소 | 2 | 2.9% | |
| 따 | 2 | 2.9% | |
| 뜻 | 2 | 2.9% | |
| 작 | 2 | 2.9% | |
| 은 | 2 | 2.9% | |
| 건 | 2 | 2.9% | |
| 물 | 2 | 2.9% | |
| 유 | 1 | 1.4% | |
| 웨 | 1 | 1.4% | |
| 트 | 1 | 1.4% | |
| 빌 | 1 | 1.4% | |
| 지 | 1 | 1.4% | |
| 에 | 1 | 1.4% | |
| 위 | 1 | 1.4% | |
| 치 | 1 | 1.4% | |
| 급 | 1 | 1.4% | |
| 튜 | 1 | 1.4% | |
| Other values (23) | 23 | 32.9% |
Most frequent Geometric Shapes characters
| Value | Count | Frequency (%) | |
| ▲ | 4 | 36.4% | |
| ◔ | 2 | 18.2% | |
| △ | 2 | 18.2% | |
| ◈ | 2 | 18.2% | |
| ▶ | 1 | 9.1% |
Most frequent Devanagari characters
| Value | Count | Frequency (%) | |
| ॐ | 2 | 100.0% |
Most frequent Math Operators characters
| Value | Count | Frequency (%) | |
| ⋆ | 4 | 57.1% | |
| √ | 2 | 28.6% | |
| ⊹ | 1 | 14.3% |
Most frequent Misc Technical characters
| Value | Count | Frequency (%) | |
| ⍟ | 4 | 57.1% | |
| ⏩ | 1 | 14.3% | |
| ⏪ | 1 | 14.3% | |
| ⌚ | 1 | 14.3% |
host_id
Real number (ℝ≥0)
| Statistic | Value |
|---|---|
| Distinct count | 37457 |
| Unique (%) | 76.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 67620010.64661008 |
| Minimum | 2438 |
| Maximum | 274321313 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 382.1 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | 2438 |
| 5-th percentile | 815564.1 |
| Q1 | 7822033 |
| median | 30793816 |
| Q3 | 107434423 |
| 95-th percentile | 241764600.2 |
| Maximum | 274321313 |
| Range | 274318875 |
| Interquartile range (IQR) | 99612390 |
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 78610967.03 |
| Coefficient of variation (CV) | 1.162539998 |
| Kurtosis | 0.1691057556 |
| Mean | 67620010.65 |
| Median Absolute Deviation (MAD) | 27543913 |
| Skewness | 1.206213924 |
| Sum | 3.306280421e+12 |
| Variance | 6.179684138e+15 |
[Figure: histogram with fixed size bins (bins=10)]
Common values
| Value | Count | Frequency (%) | |
| 219517861 | 327 | 0.7% | |
| 107434423 | 232 | 0.5% | |
| 30283594 | 121 | 0.2% | |
| 137358866 | 103 | 0.2% | |
| 12243051 | 96 | 0.2% | |
| 16098958 | 96 | 0.2% | |
| 61391963 | 91 | 0.2% | |
| 22541573 | 87 | 0.2% | |
| 200380610 | 65 | 0.1% | |
| 7503643 | 52 | 0.1% | |
| 1475015 | 52 | 0.1% | |
| 120762452 | 50 | 0.1% | |
| 2856748 | 49 | 0.1% | |
| 205031545 | 49 | 0.1% | |
| 190921808 | 47 | 0.1% | |
| 26377263 | 43 | 0.1% | |
| 2119276 | 39 | 0.1% | |
| 19303369 | 37 | 0.1% | |
| 25237492 | 34 | 0.1% | |
| 119669058 | 34 | 0.1% | |
| 76104209 | 33 | 0.1% | |
| 113805886 | 33 | 0.1% | |
| 213781715 | 33 | 0.1% | |
| 238321374 | 32 | 0.1% | |
| 51501835 | 31 | 0.1% | |
| Other values (37432) | 47029 | 96.2% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 2438 | 1 | < 0.1% | |
| 2571 | 1 | < 0.1% | |
| 2787 | 6 | < 0.1% | |
| 2845 | 2 | < 0.1% | |
| 2868 | 1 | < 0.1% | |
| 2881 | 2 | < 0.1% | |
| 3151 | 1 | < 0.1% | |
| 3211 | 1 | < 0.1% | |
| 3415 | 1 | < 0.1% | |
| 3563 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 274321313 | 1 | < 0.1% | |
| 274311461 | 1 | < 0.1% | |
| 274307600 | 1 | < 0.1% | |
| 274298453 | 1 | < 0.1% | |
| 274273284 | 1 | < 0.1% | |
| 274225617 | 1 | < 0.1% | |
| 274195458 | 1 | < 0.1% | |
| 274188386 | 1 | < 0.1% | |
| 274103383 | 1 | < 0.1% | |
| 274079964 | 1 | < 0.1% |
host_name
Categorical
| Statistic | Value |
|---|---|
| Distinct count | 11452 |
| Unique (%) | 23.4% |
| Missing | 21 |
| Missing (%) | < 0.1% |
| Memory size | 382.1 KiB |
| Value | Count | Frequency (%) | |
| Michael | 417 | 0.9% | |
| David | 403 | 0.8% | |
| Sonder (NYC) | 327 | 0.7% | |
| John | 294 | 0.6% | |
| Alex | 279 | 0.6% | |
| Blueground | 232 | 0.5% | |
| Sarah | 227 | 0.5% | |
| Daniel | 226 | 0.5% | |
| Jessica | 205 | 0.4% | |
| Maria | 204 | 0.4% | |
| Mike | 194 | 0.4% | |
| Andrew | 190 | 0.4% | |
| Anna | 187 | 0.4% | |
| Chris | 182 | 0.4% | |
| Laura | 182 | 0.4% | |
| Melissa | 160 | 0.3% | |
| Emily | 157 | 0.3% | |
| Jennifer | 154 | 0.3% | |
| James | 151 | 0.3% | |
| Rachel | 146 | 0.3% | |
| Kara | 143 | 0.3% | |
| Amy | 142 | 0.3% | |
| Jonathan | 142 | 0.3% | |
| Jason | 140 | 0.3% | |
| Brian | 140 | 0.3% | |
| Other values (11427) | 43650 | 89.3% |
Length
| Statistic | Value |
|---|---|
| Max length | 35 |
| Median length | 6 |
| Mean length | 6.123530013 |
| Min length | 1 |
Most occurring characters
| Value | Count | Frequency (%) | |
| a | 37950 | 12.7% | |
| e | 28680 | 9.6% | |
| i | 24284 | 8.1% | |
| n | 24134 | 8.1% | |
| r | 17861 | 6.0% | |
| l | 15327 | 5.1% | |
| o | 12743 | 4.3% | |
| t | 9401 | 3.1% | |
| s | 9147 | 3.1% | |
| h | 9040 | 3.0% | |
| y | 7441 | 2.5% | |
| d | 7116 | 2.4% | |
| A | 6458 | 2.2% | |
| u | 5967 | 2.0% | |
| (space) | 5805 | 1.9% | |
| J | 5458 | 1.8% | |
| m | 5309 | 1.8% | |
| c | 5301 | 1.8% | |
| M | 5298 | 1.8% | |
| S | 4744 | 1.6% | |
| C | 3737 | 1.2% | |
| L | 2885 | 1.0% | |
| D | 2752 | 0.9% | |
| g | 2670 | 0.9% | |
| K | 2618 | 0.9% | |
| Other values (179) | 37284 | 12.5% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 235979 | 78.8% | |
| Uppercase Letter | 54823 | 18.3% | |
| Space Separator | 5811 | 1.9% | |
| Other Punctuation | 1592 | 0.5% | |
| Open Punctuation | 381 | 0.1% | |
| Close Punctuation | 379 | 0.1% | |
| Dash Punctuation | 209 | 0.1% | |
| Other Letter | 110 | < 0.1% | |
| Decimal Number | 84 | < 0.1% | |
| Math Symbol | 34 | < 0.1% | |
| Format | 2 | < 0.1% | |
| Other Symbol | 2 | < 0.1% | |
| Final Punctuation | 2 | < 0.1% | |
| Currency Symbol | 1 | < 0.1% | |
| Connector Punctuation | 1 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| A | 6458 | 11.8% | |
| J | 5458 | 10.0% | |
| M | 5298 | 9.7% | |
| S | 4744 | 8.7% | |
| C | 3737 | 6.8% | |
| L | 2885 | 5.3% | |
| D | 2752 | 5.0% | |
| K | 2618 | 4.8% | |
| R | 2566 | 4.7% | |
| E | 2361 | 4.3% | |
| N | 2179 | 4.0% | |
| B | 2120 | 3.9% | |
| T | 1871 | 3.4% | |
| G | 1458 | 2.7% | |
| P | 1412 | 2.6% | |
| H | 1389 | 2.5% | |
| Y | 1122 | 2.0% | |
| V | 971 | 1.8% | |
| F | 919 | 1.7% | |
| I | 800 | 1.5% | |
| W | 567 | 1.0% | |
| O | 534 | 1.0% | |
| Z | 355 | 0.6% | |
| X | 90 | 0.2% | |
| U | 73 | 0.1% | |
| Other values (13) | 86 | 0.2% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| a | 37950 | 16.1% | |
| e | 28680 | 12.2% | |
| i | 24284 | 10.3% | |
| n | 24134 | 10.2% | |
| r | 17861 | 7.6% | |
| l | 15327 | 6.5% | |
| o | 12743 | 5.4% | |
| t | 9401 | 4.0% | |
| s | 9147 | 3.9% | |
| h | 9040 | 3.8% | |
| y | 7441 | 3.2% | |
| d | 7116 | 3.0% | |
| u | 5967 | 2.5% | |
| m | 5309 | 2.2% | |
| c | 5301 | 2.2% | |
| g | 2670 | 1.1% | |
| k | 2586 | 1.1% | |
| v | 2270 | 1.0% | |
| b | 2139 | 0.9% | |
| p | 1331 | 0.6% | |
| f | 1203 | 0.5% | |
| z | 1059 | 0.4% | |
| x | 952 | 0.4% | |
| w | 904 | 0.4% | |
| j | 609 | 0.3% | |
| Other values (39) | 555 | 0.2% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 5805 | 99.9% | |
| (other space separator) | 6 | 0.1% | |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| & | 1162 | 73.0% | |
| . | 309 | 19.4% | |
| / | 41 | 2.6% | |
| , | 35 | 2.2% | |
| ' | 25 | 1.6% | |
| @ | 8 | 0.5% | |
| " | 6 | 0.4% | |
| ! | 4 | 0.3% | |
| : | 2 | 0.1% |
Most frequent Math Symbol characters
| Value | Count | Frequency (%) | |
| + | 34 | 100.0% |
Most frequent Open Punctuation characters
| Value | Count | Frequency (%) | |
| ( | 381 | 100.0% |
Most frequent Close Punctuation characters
| Value | Count | Frequency (%) | |
| ) | 379 | 100.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 209 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 5 | 20 | 23.8% | |
| 0 | 14 | 16.7% | |
| 7 | 14 | 16.7% | |
| 2 | 11 | 13.1% | |
| 1 | 7 | 8.3% | |
| 4 | 7 | 8.3% | |
| 3 | 4 | 4.8% | |
| 6 | 4 | 4.8% | |
| 8 | 2 | 2.4% | |
| 9 | 1 | 1.2% |
Most frequent Other Letter characters
| Value | Count | Frequency (%) | |
| 明 | 6 | 5.5% | |
| 青 | 5 | 4.5% | |
| 美 | 5 | 4.5% | |
| 德 | 5 | 4.5% | |
| 文 | 4 | 3.6% | |
| 常 | 3 | 2.7% | |
| 春 | 3 | 2.7% | |
| 铀 | 3 | 2.7% | |
| 正 | 2 | 1.8% | |
| 川 | 2 | 1.8% | |
| 柏 | 2 | 1.8% | |
| 润 | 2 | 1.8% | |
| 祥 | 2 | 1.8% | |
| 茵 | 2 | 1.8% | |
| 소 | 2 | 1.8% | |
| 정 | 2 | 1.8% | |
| 辣 | 2 | 1.8% | |
| 泽 | 2 | 1.8% | |
| 宇 | 2 | 1.8% | |
| 静 | 2 | 1.8% | |
| 立 | 1 | 0.9% | |
| 威 | 1 | 0.9% | |
| 卷 | 1 | 0.9% | |
| 妮 | 1 | 0.9% | |
| 단 | 1 | 0.9% | |
| Other values (47) | 47 | 42.7% |
Most frequent Format characters
| Value | Count | Frequency (%) | |
| | 2 | 100.0% |
Most frequent Currency Symbol characters
| Value | Count | Frequency (%) | |
| £ | 1 | 100.0% |
Most frequent Connector Punctuation characters
| Value | Count | Frequency (%) | |
| _ | 1 | 100.0% |
Most frequent Other Symbol characters
| Value | Count | Frequency (%) | |
| ★ | 2 | 100.0% |
Most frequent Final Punctuation characters
| Value | Count | Frequency (%) | |
| ’ | 2 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 290746 | 97.1% | |
| Common | 8498 | 2.8% | |
| Han | 91 | < 0.1% | |
| Cyrillic | 56 | < 0.1% | |
| Hangul | 11 | < 0.1% | |
| Hebrew | 5 | < 0.1% | |
| Hiragana | 3 | < 0.1% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| a | 37950 | 13.1% | |
| e | 28680 | 9.9% | |
| i | 24284 | 8.4% | |
| n | 24134 | 8.3% | |
| r | 17861 | 6.1% | |
| l | 15327 | 5.3% | |
| o | 12743 | 4.4% | |
| t | 9401 | 3.2% | |
| s | 9147 | 3.1% | |
| h | 9040 | 3.1% | |
| y | 7441 | 2.6% | |
| d | 7116 | 2.4% | |
| A | 6458 | 2.2% | |
| u | 5967 | 2.1% | |
| J | 5458 | 1.9% | |
| m | 5309 | 1.8% | |
| c | 5301 | 1.8% | |
| M | 5298 | 1.8% | |
| S | 4744 | 1.6% | |
| C | 3737 | 1.3% | |
| L | 2885 | 1.0% | |
| D | 2752 | 0.9% | |
| g | 2670 | 0.9% | |
| K | 2618 | 0.9% | |
| k | 2586 | 0.9% | |
| Other values (55) | 31839 | 11.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 5805 | 68.3% | |
| & | 1162 | 13.7% | |
| ( | 381 | 4.5% | |
| ) | 379 | 4.5% | |
| . | 309 | 3.6% | |
| - | 209 | 2.5% | |
| / | 41 | 0.5% | |
| , | 35 | 0.4% | |
| + | 34 | 0.4% | |
| ' | 25 | 0.3% | |
| 5 | 20 | 0.2% | |
| 0 | 14 | 0.2% | |
| 7 | 14 | 0.2% | |
| 2 | 11 | 0.1% | |
| @ | 8 | 0.1% | |
| 1 | 7 | 0.1% | |
| 4 | 7 | 0.1% | |
| " | 6 | 0.1% | |
| | 6 | 0.1% | |
| ! | 4 | < 0.1% | |
| 3 | 4 | < 0.1% | |
| 6 | 4 | < 0.1% | |
| : | 2 | < 0.1% | |
| | 2 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| Other values (5) | 7 | 0.1% |
Most frequent Han characters
| Value | Count | Frequency (%) | |
| 明 | 6 | 6.6% | |
| 青 | 5 | 5.5% | |
| 美 | 5 | 5.5% | |
| 德 | 5 | 5.5% | |
| 文 | 4 | 4.4% | |
| 常 | 3 | 3.3% | |
| 春 | 3 | 3.3% | |
| 铀 | 3 | 3.3% | |
| 正 | 2 | 2.2% | |
| 川 | 2 | 2.2% | |
| 柏 | 2 | 2.2% | |
| 润 | 2 | 2.2% | |
| 祥 | 2 | 2.2% | |
| 茵 | 2 | 2.2% | |
| 辣 | 2 | 2.2% | |
| 泽 | 2 | 2.2% | |
| 宇 | 2 | 2.2% | |
| 静 | 2 | 2.2% | |
| 立 | 1 | 1.1% | |
| 威 | 1 | 1.1% | |
| 卷 | 1 | 1.1% | |
| 妮 | 1 | 1.1% | |
| 孙 | 1 | 1.1% | |
| 浩 | 1 | 1.1% | |
| 岑 | 1 | 1.1% | |
| Other values (30) | 30 | 33.0% |
Most frequent Hangul characters
| Value | Count | Frequency (%) | |
| 소 | 2 | 18.2% | |
| 정 | 2 | 18.2% | |
| 단 | 1 | 9.1% | |
| 비 | 1 | 9.1% | |
| 진 | 1 | 9.1% | |
| 현 | 1 | 9.1% | |
| 선 | 1 | 9.1% | |
| 빈 | 1 | 9.1% | |
| 나 | 1 | 9.1% |
Most frequent Cyrillic characters
| Value | Count | Frequency (%) | |
| е | 6 | 10.7% | |
| н | 6 | 10.7% | |
| а | 6 | 10.7% | |
| А | 4 | 7.1% | |
| л | 4 | 7.1% | |
| и | 4 | 7.1% | |
| к | 3 | 5.4% | |
| с | 3 | 5.4% | |
| й | 3 | 5.4% | |
| р | 3 | 5.4% | |
| Т | 2 | 3.6% | |
| д | 2 | 3.6% | |
| т | 1 | 1.8% | |
| В | 1 | 1.8% | |
| ь | 1 | 1.8% | |
| З | 1 | 1.8% | |
| і | 1 | 1.8% | |
| Ю | 1 | 1.8% | |
| я | 1 | 1.8% | |
| С | 1 | 1.8% | |
| г | 1 | 1.8% | |
| О | 1 | 1.8% |
Most frequent Hebrew characters
| Value | Count | Frequency (%) | |
| ד | 1 | 20.0% | |
| נ | 1 | 20.0% | |
| י | 1 | 20.0% | |
| א | 1 | 20.0% | |
| ל | 1 | 20.0% |
Most frequent Hiragana characters
| Value | Count | Frequency (%) | |
| ゆ | 1 | 33.3% | |
| り | 1 | 33.3% | |
| あ | 1 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 298985 | 99.9% | |
| None | 247 | 0.1% | |
| CJK | 91 | < 0.1% | |
| Cyrillic | 56 | < 0.1% | |
| Hangul | 11 | < 0.1% | |
| Punctuation | 10 | < 0.1% | |
| Hebrew | 5 | < 0.1% | |
| Hiragana | 3 | < 0.1% | |
| Misc Symbols | 2 | < 0.1% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| a | 37950 | 12.7% | |
| e | 28680 | 9.6% | |
| i | 24284 | 8.1% | |
| n | 24134 | 8.1% | |
| r | 17861 | 6.0% | |
| l | 15327 | 5.1% | |
| o | 12743 | 4.3% | |
| t | 9401 | 3.1% | |
| s | 9147 | 3.1% | |
| h | 9040 | 3.0% | |
| y | 7441 | 2.5% | |
| d | 7116 | 2.4% | |
| A | 6458 | 2.2% | |
| u | 5967 | 2.0% | |
| (space) | 5805 | 1.9% | |
| J | 5458 | 1.8% | |
| m | 5309 | 1.8% | |
| c | 5301 | 1.8% | |
| M | 5298 | 1.8% | |
| S | 4744 | 1.6% | |
| C | 3737 | 1.2% | |
| L | 2885 | 1.0% | |
| D | 2752 | 0.9% | |
| g | 2670 | 0.9% | |
| K | 2618 | 0.9% | |
| Other values (52) | 36859 | 12.3% |
Most frequent None characters
| Value | Count | Frequency (%) | |
| é | 107 | 43.3% | |
| í | 24 | 9.7% | |
| á | 22 | 8.9% | |
| ú | 19 | 7.7% | |
| ë | 13 | 5.3% | |
| ô | 11 | 4.5% | |
| ó | 9 | 3.6% | |
| è | 7 | 2.8% | |
| ç | 5 | 2.0% | |
| ï | 4 | 1.6% | |
| ı | 4 | 1.6% | |
| ü | 3 | 1.2% | |
| ã | 2 | 0.8% | |
| ø | 2 | 0.8% | |
| û | 1 | 0.4% | |
| ý | 1 | 0.4% | |
| ö | 1 | 0.4% | |
| ğ | 1 | 0.4% | |
| æ | 1 | 0.4% | |
| £ | 1 | 0.4% | |
| É | 1 | 0.4% | |
| Ā | 1 | 0.4% | |
| ū | 1 | 0.4% | |
| Ö | 1 | 0.4% | |
| ò | 1 | 0.4% | |
| Other values (4) | 4 | 1.6% |
Most frequent CJK characters
| Value | Count | Frequency (%) | |
| 明 | 6 | 6.6% | |
| 青 | 5 | 5.5% | |
| 美 | 5 | 5.5% | |
| 德 | 5 | 5.5% | |
| 文 | 4 | 4.4% | |
| 常 | 3 | 3.3% | |
| 春 | 3 | 3.3% | |
| 铀 | 3 | 3.3% | |
| 正 | 2 | 2.2% | |
| 川 | 2 | 2.2% | |
| 柏 | 2 | 2.2% | |
| 润 | 2 | 2.2% | |
| 祥 | 2 | 2.2% | |
| 茵 | 2 | 2.2% | |
| 辣 | 2 | 2.2% | |
| 泽 | 2 | 2.2% | |
| 宇 | 2 | 2.2% | |
| 静 | 2 | 2.2% | |
| 立 | 1 | 1.1% | |
| 威 | 1 | 1.1% | |
| 卷 | 1 | 1.1% | |
| 妮 | 1 | 1.1% | |
| 孙 | 1 | 1.1% | |
| 浩 | 1 | 1.1% | |
| 岑 | 1 | 1.1% | |
| Other values (30) | 30 | 33.0% |
Most frequent Punctuation characters
| Value | Count | Frequency (%) | |
| | 6 | 60.0% | |
| | 2 | 20.0% | |
| ’ | 2 | 20.0% |
Most frequent Hangul characters
| Value | Count | Frequency (%) | |
| 소 | 2 | 18.2% | |
| 정 | 2 | 18.2% | |
| 단 | 1 | 9.1% | |
| 비 | 1 | 9.1% | |
| 진 | 1 | 9.1% | |
| 현 | 1 | 9.1% | |
| 선 | 1 | 9.1% | |
| 빈 | 1 | 9.1% | |
| 나 | 1 | 9.1% |
Most frequent Misc Symbols characters
| Value | Count | Frequency (%) | |
| ★ | 2 | 100.0% |
Most frequent Cyrillic characters
| Value | Count | Frequency (%) | |
| е | 6 | 10.7% | |
| н | 6 | 10.7% | |
| а | 6 | 10.7% | |
| А | 4 | 7.1% | |
| л | 4 | 7.1% | |
| и | 4 | 7.1% | |
| к | 3 | 5.4% | |
| с | 3 | 5.4% | |
| й | 3 | 5.4% | |
| р | 3 | 5.4% | |
| Т | 2 | 3.6% | |
| д | 2 | 3.6% | |
| т | 1 | 1.8% | |
| В | 1 | 1.8% | |
| ь | 1 | 1.8% | |
| З | 1 | 1.8% | |
| і | 1 | 1.8% | |
| Ю | 1 | 1.8% | |
| я | 1 | 1.8% | |
| С | 1 | 1.8% | |
| г | 1 | 1.8% | |
| О | 1 | 1.8% |
Most frequent Hebrew characters
| Value | Count | Frequency (%) | |
| ד | 1 | 20.0% | |
| נ | 1 | 20.0% | |
| י | 1 | 20.0% | |
| א | 1 | 20.0% | |
| ל | 1 | 20.0% |
Most frequent Hiragana characters
| Value | Count | Frequency (%) | |
| ゆ | 1 | 33.3% | |
| り | 1 | 33.3% | |
| あ | 1 | 33.3% |
neighbourhood_group
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 382.1 KiB |
Common values
| Value | Count | Frequency (%) | |
| Manhattan | 21661 | 44.3% | |
| Brooklyn | 20104 | 41.1% | |
| Queens | 5666 | 11.6% | |
| Bronx | 1091 | 2.2% | |
| Staten Island | 373 | 0.8% |
Length
| Max length | 13 |
|---|---|
| Median length | 8 |
| Mean length | 8.182452193 |
| Min length | 5 |
Most occurring characters
| Value | Count | Frequency (%) | |
| n | 70929 | 17.7% | |
| a | 65729 | 16.4% | |
| t | 44068 | 11.0% | |
| o | 41299 | 10.3% | |
| M | 21661 | 5.4% | |
| h | 21661 | 5.4% | |
| B | 21195 | 5.3% | |
| r | 21195 | 5.3% | |
| l | 20477 | 5.1% | |
| k | 20104 | 5.0% | |
| y | 20104 | 5.0% | |
| e | 11705 | 2.9% | |
| s | 6039 | 1.5% | |
| Q | 5666 | 1.4% | |
| u | 5666 | 1.4% | |
| x | 1091 | 0.3% | |
| S | 373 | 0.1% | |
| (space) | 373 | 0.1% | |
| I | 373 | 0.1% | |
| d | 373 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 350440 | 87.6% | |
| Uppercase Letter | 49268 | 12.3% | |
| Space Separator | 373 | 0.1% |
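The character and category tables for neighbourhood_group can be reproduced from the five value counts alone. The sketch below is illustrative and is not how the profiling tool itself computes them: it weights each character of each borough name by that name's count and classifies characters with Python's standard `unicodedata.category` (`Lu` = Uppercase Letter, `Ll` = Lowercase Letter, `Zs` = Space Separator). The variable names are assumptions made for this example.

```python
import unicodedata
from collections import Counter

# Value counts copied from the neighbourhood_group table above.
value_counts = {
    "Manhattan": 21661,
    "Brooklyn": 20104,
    "Queens": 5666,
    "Bronx": 1091,
    "Staten Island": 373,
}

n_rows = sum(value_counts.values())

char_counts = Counter()      # occurrences of each character over all rows
category_counts = Counter()  # occurrences per Unicode general category
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count
        category_counts[unicodedata.category(ch)] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / n_rows

print("rows:", n_rows)
print("total characters:", total_chars)
print("mean length:", mean_length)
print("top characters:", char_counts.most_common(4))
print("categories:", dict(category_counts))
```

The totals line up with the report: the per-category sums match the "Most occurring categories" table, and the total character count matches the single ASCII block row.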
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| M | 21661 | 44.0% | |
| B | 21195 | 43.0% | |
| Q | 5666 | 11.5% | |
| S | 373 | 0.8% | |
| I | 373 | 0.8% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 70929 | 20.2% | |
| a | 65729 | 18.8% | |
| t | 44068 | 12.6% | |
| o | 41299 | 11.8% | |
| h | 21661 | 6.2% | |
| r | 21195 | 6.0% | |
| l | 20477 | 5.8% | |
| k | 20104 | 5.7% | |
| y | 20104 | 5.7% | |
| e | 11705 | 3.3% | |
| s | 6039 | 1.7% | |
| u | 5666 | 1.6% | |
| x | 1091 | 0.3% | |
| d | 373 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 373 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 399708 | 99.9% | |
| Common | 373 | 0.1% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| n | 70929 | 17.7% | |
| a | 65729 | 16.4% | |
| t | 44068 | 11.0% | |
| o | 41299 | 10.3% | |
| M | 21661 | 5.4% | |
| h | 21661 | 5.4% | |
| B | 21195 | 5.3% | |
| r | 21195 | 5.3% | |
| l | 20477 | 5.1% | |
| k | 20104 | 5.0% | |
| y | 20104 | 5.0% | |
| e | 11705 | 2.9% | |
| s | 6039 | 1.5% | |
| Q | 5666 | 1.4% | |
| u | 5666 | 1.4% | |
| x | 1091 | 0.3% | |
| S | 373 | 0.1% | |
| I | 373 | 0.1% | |
| d | 373 | 0.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 373 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 400081 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| n | 70929 | 17.7% | |
| a | 65729 | 16.4% | |
| t | 44068 | 11.0% | |
| o | 41299 | 10.3% | |
| M | 21661 | 5.4% | |
| h | 21661 | 5.4% | |
| B | 21195 | 5.3% | |
| r | 21195 | 5.3% | |
| l | 20477 | 5.1% | |
| k | 20104 | 5.0% | |
| y | 20104 | 5.0% | |
| e | 11705 | 2.9% | |
| s | 6039 | 1.5% | |
| Q | 5666 | 1.4% | |
| u | 5666 | 1.4% | |
| x | 1091 | 0.3% | |
| S | 373 | 0.1% | |
| (space) | 373 | 0.1% | |
| I | 373 | 0.1% | |
| d | 373 | 0.1% |
neighbourhood
Categorical
| Distinct count | 221 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 382.1 KiB |
Common values
| Value | Count | Frequency (%) | |
| Williamsburg | 3920 | 8.0% | |
| Bedford-Stuyvesant | 3714 | 7.6% | |
| Harlem | 2658 | 5.4% | |
| Bushwick | 2465 | 5.0% | |
| Upper West Side | 1971 | 4.0% | |
| Hell's Kitchen | 1958 | 4.0% | |
| East Village | 1853 | 3.8% | |
| Upper East Side | 1798 | 3.7% | |
| Crown Heights | 1564 | 3.2% | |
| Midtown | 1545 | 3.2% | |
| East Harlem | 1117 | 2.3% | |
| Greenpoint | 1115 | 2.3% | |
| Chelsea | 1113 | 2.3% | |
| Lower East Side | 911 | 1.9% | |
| Astoria | 900 | 1.8% | |
| Washington Heights | 899 | 1.8% | |
| West Village | 768 | 1.6% | |
| Financial District | 744 | 1.5% | |
| Flatbush | 621 | 1.3% | |
| Clinton Hill | 572 | 1.2% | |
| Long Island City | 537 | 1.1% | |
| Prospect-Lefferts Gardens | 535 | 1.1% | |
| Park Slope | 506 | 1.0% | |
| East Flatbush | 500 | 1.0% | |
| Fort Greene | 489 | 1.0% | |
| Other values (196) | 14122 | 28.9% |
Length
| Max length | 26 |
|---|---|
| Median length | 12 |
| Mean length | 11.89479497 |
| Min length | 4 |
Most occurring characters
| Value | Count | Frequency (%) | |
| e | 53470 | 9.2% | |
| i | 42282 | 7.3% | |
| s | 39625 | 6.8% | |
| t | 38587 | 6.6% | |
| a | 37608 | 6.5% | |
| l | 34448 | 5.9% | |
| r | 33667 | 5.8% | |
| (space) | 30210 | 5.2% | |
| n | 26099 | 4.5% | |
| o | 24032 | 4.1% | |
| d | 19663 | 3.4% | |
| h | 14868 | 2.6% | |
| u | 14624 | 2.5% | |
| g | 14589 | 2.5% | |
| H | 11901 | 2.0% | |
| S | 11483 | 2.0% | |
| p | 11418 | 2.0% | |
| m | 9704 | 1.7% | |
| w | 9573 | 1.6% | |
| c | 9384 | 1.6% | |
| B | 8374 | 1.4% | |
| W | 8185 | 1.4% | |
| y | 7489 | 1.3% | |
| E | 7084 | 1.2% | |
| b | 5733 | 1.0% | |
| Other values (29) | 57496 | 9.9% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 461107 | 79.3% | |
| Uppercase Letter | 83934 | 14.4% | |
| Space Separator | 30210 | 5.2% | |
| Dash Punctuation | 4251 | 0.7% | |
| Other Punctuation | 2094 | 0.4% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| H | 11901 | 14.2% | |
| S | 11483 | 13.7% | |
| B | 8374 | 10.0% | |
| W | 8185 | 9.8% | |
| E | 7084 | 8.4% | |
| C | 5327 | 6.3% | |
| U | 3833 | 4.6% | |
| G | 3723 | 4.4% | |
| F | 3281 | 3.9% | |
| V | 3209 | 3.8% | |
| M | 2953 | 3.5% | |
| K | 2741 | 3.3% | |
| P | 2421 | 2.9% | |
| L | 2193 | 2.6% | |
| D | 1577 | 1.9% | |
| A | 1127 | 1.3% | |
| R | 1122 | 1.3% | |
| I | 1024 | 1.2% | |
| T | 827 | 1.0% | |
| N | 666 | 0.8% | |
| J | 444 | 0.5% | |
| Y | 232 | 0.3% | |
| O | 147 | 0.2% | |
| Q | 60 | 0.1% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| e | 53470 | 11.6% | |
| i | 42282 | 9.2% | |
| s | 39625 | 8.6% | |
| t | 38587 | 8.4% | |
| a | 37608 | 8.2% | |
| l | 34448 | 7.5% | |
| r | 33667 | 7.3% | |
| n | 26099 | 5.7% | |
| o | 24032 | 5.2% | |
| d | 19663 | 4.3% | |
| h | 14868 | 3.2% | |
| u | 14624 | 3.2% | |
| g | 14589 | 3.2% | |
| p | 11418 | 2.5% | |
| m | 9704 | 2.1% | |
| w | 9573 | 2.1% | |
| c | 9384 | 2.0% | |
| y | 7489 | 1.6% | |
| b | 5733 | 1.2% | |
| f | 4934 | 1.1% | |
| k | 4784 | 1.0% | |
| v | 4392 | 1.0% | |
| z | 105 | < 0.1% | |
| x | 19 | < 0.1% | |
| q | 10 | < 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 30210 | 100.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 4251 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| ' | 1968 | 94.0% | |
| . | 124 | 5.9% | |
| , | 2 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 545041 | 93.7% | |
| Common | 36555 | 6.3% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| e | 53470 | 9.8% | |
| i | 42282 | 7.8% | |
| s | 39625 | 7.3% | |
| t | 38587 | 7.1% | |
| a | 37608 | 6.9% | |
| l | 34448 | 6.3% | |
| r | 33667 | 6.2% | |
| n | 26099 | 4.8% | |
| o | 24032 | 4.4% | |
| d | 19663 | 3.6% | |
| h | 14868 | 2.7% | |
| u | 14624 | 2.7% | |
| g | 14589 | 2.7% | |
| H | 11901 | 2.2% | |
| S | 11483 | 2.1% | |
| p | 11418 | 2.1% | |
| m | 9704 | 1.8% | |
| w | 9573 | 1.8% | |
| c | 9384 | 1.7% | |
| B | 8374 | 1.5% | |
| W | 8185 | 1.5% | |
| y | 7489 | 1.4% | |
| E | 7084 | 1.3% | |
| b | 5733 | 1.1% | |
| C | 5327 | 1.0% | |
| Other values (24) | 45824 | 8.4% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 30210 | 82.6% | |
| - | 4251 | 11.6% | |
| ' | 1968 | 5.4% | |
| . | 124 | 0.3% | |
| , | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 581596 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| e | 53470 | 9.2% | |
| i | 42282 | 7.3% | |
| s | 39625 | 6.8% | |
| t | 38587 | 6.6% | |
| a | 37608 | 6.5% | |
| l | 34448 | 5.9% | |
| r | 33667 | 5.8% | |
| (space) | 30210 | 5.2% | |
| n | 26099 | 4.5% | |
| o | 24032 | 4.1% | |
| d | 19663 | 3.4% | |
| h | 14868 | 2.6% | |
| u | 14624 | 2.5% | |
| g | 14589 | 2.5% | |
| H | 11901 | 2.0% | |
| S | 11483 | 2.0% | |
| p | 11418 | 2.0% | |
| m | 9704 | 1.7% | |
| w | 9573 | 1.6% | |
| c | 9384 | 1.6% | |
| B | 8374 | 1.4% | |
| W | 8185 | 1.4% | |
| y | 7489 | 1.3% | |
| E | 7084 | 1.2% | |
| b | 5733 | 1.0% | |
| Other values (29) | 57496 | 9.9% |
latitude
Real number (ℝ≥0)
| Distinct count | 19048 |
|---|---|
| Unique (%) | 39.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.72894888066264 |
|---|---|
| Minimum | 40.49979 |
| Maximum | 40.91306 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | 40.49979 |
|---|---|
| 5-th percentile | 40.646114 |
| Q1 | 40.6901 |
| median | 40.72307 |
| Q3 | 40.763115 |
| 95-th percentile | 40.825643 |
| Maximum | 40.91306 |
| Range | 0.41327 |
| Interquartile range (IQR) | 0.073015 |
Descriptive statistics
| Standard deviation | 0.05453007806 |
|---|---|
| Coefficient of variation (CV) | 0.001338853065 |
| Kurtosis | 0.1488446574 |
| Mean | 40.72894888 |
| Median Absolute Deviation (MAD) | 0.03642 |
| Skewness | 0.2371665585 |
| Sum | 1991441.956 |
| Variance | 0.002973529413 |
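Several of the latitude figures above are derived from one another, which gives a quick consistency check on the tables. The sketch below recomputes the coefficient of variation, variance, interquartile range, range and sum from the other reported values. The variable names are assumptions made for this example, and small differences in the last digits are expected because the inputs are the rounded figures printed in the report.

```python
# Figures copied from the latitude tables above.
n = 48895
mean = 40.72894888066264
std = 0.05453007806
q1, q3 = 40.6901, 40.763115
minimum, maximum = 40.49979, 40.91306

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
total = mean * n                  # sum of all values

print("CV:", cv)
print("Variance:", variance)
print("IQR:", iqr)
print("Range:", value_range)
print("Sum:", total)
```

The same relations can be applied to the longitude and price tables; for longitude the coefficient of variation is negative only because the mean is negative.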
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 40.71813 | 18 | < 0.1% | |
| 40.68634 | 13 | < 0.1% | |
| 40.69414 | 13 | < 0.1% | |
| 40.68444 | 13 | < 0.1% | |
| 40.71171 | 12 | < 0.1% | |
| 40.68537 | 12 | < 0.1% | |
| 40.76189 | 12 | < 0.1% | |
| 40.76125 | 12 | < 0.1% | |
| 40.71353 | 12 | < 0.1% | |
| 40.69054 | 11 | < 0.1% | |
| 40.70766 | 11 | < 0.1% | |
| 40.71239 | 11 | < 0.1% | |
| 40.76106 | 11 | < 0.1% | |
| 40.72607 | 11 | < 0.1% | |
| 40.6881 | 11 | < 0.1% | |
| 40.7191 | 11 | < 0.1% | |
| 40.68683 | 11 | < 0.1% | |
| 40.7069 | 11 | < 0.1% | |
| 40.76769 | 11 | < 0.1% | |
| 40.69454 | 11 | < 0.1% | |
| 40.68589 | 11 | < 0.1% | |
| 40.71923 | 11 | < 0.1% | |
| 40.71947 | 10 | < 0.1% | |
| 40.76273 | 10 | < 0.1% | |
| 40.72434 | 10 | < 0.1% | |
| Other values (19023) | 48605 | 99.4% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 40.49979 | 1 | < 0.1% | |
| 40.50641 | 1 | < 0.1% | |
| 40.50708 | 1 | < 0.1% | |
| 40.50868 | 1 | < 0.1% | |
| 40.50873 | 1 | < 0.1% | |
| 40.50943 | 1 | < 0.1% | |
| 40.51133 | 1 | < 0.1% | |
| 40.52211 | 1 | < 0.1% | |
| 40.52293 | 1 | < 0.1% | |
| 40.527 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 40.91306 | 1 | < 0.1% | |
| 40.91234 | 1 | < 0.1% | |
| 40.91169 | 1 | < 0.1% | |
| 40.91167 | 1 | < 0.1% | |
| 40.90804 | 1 | < 0.1% | |
| 40.90734 | 1 | < 0.1% | |
| 40.90527 | 1 | < 0.1% | |
| 40.90484 | 1 | < 0.1% | |
| 40.90406 | 1 | < 0.1% | |
| 40.90391 | 1 | < 0.1% |
longitude
Real number (ℝ)
| Distinct count | 14718 |
|---|---|
| Unique (%) | 30.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -73.95216961468454 |
|---|---|
| Minimum | -74.24442 |
| Maximum | -73.71299 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | -74.24442 |
|---|---|
| 5-th percentile | -74.00388 |
| Q1 | -73.98307 |
| median | -73.95568 |
| Q3 | -73.936275 |
| 95-th percentile | -73.865771 |
| Maximum | -73.71299 |
| Range | 0.53143 |
| Interquartile range (IQR) | 0.046795 |
Descriptive statistics
| Standard deviation | 0.04615673611 |
|---|---|
| Coefficient of variation (CV) | -0.0006241430961 |
| Kurtosis | 5.021646112 |
| Mean | -73.95216961 |
| Median Absolute Deviation (MAD) | 0.02485 |
| Skewness | 1.284210209 |
| Sum | -3615891.333 |
| Variance | 0.002130444288 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| -73.95677 | 18 | < 0.1% | |
| -73.95427 | 18 | < 0.1% | |
| -73.95405 | 17 | < 0.1% | |
| -73.95136 | 16 | < 0.1% | |
| -73.94791 | 16 | < 0.1% | |
| -73.9506 | 16 | < 0.1% | |
| -73.95332 | 16 | < 0.1% | |
| -73.95725 | 15 | < 0.1% | |
| -73.98589 | 15 | < 0.1% | |
| -73.95669 | 15 | < 0.1% | |
| -73.94537 | 15 | < 0.1% | |
| -73.95742 | 15 | < 0.1% | |
| -73.98439 | 15 | < 0.1% | |
| -73.95439 | 14 | < 0.1% | |
| -73.9435 | 14 | < 0.1% | |
| -73.94965 | 14 | < 0.1% | |
| -73.95443 | 14 | < 0.1% | |
| -73.95175 | 14 | < 0.1% | |
| -73.98507 | 14 | < 0.1% | |
| -73.95688 | 14 | < 0.1% | |
| -73.94349 | 14 | < 0.1% | |
| -73.9535 | 14 | < 0.1% | |
| -73.94977 | 14 | < 0.1% | |
| -73.94813 | 14 | < 0.1% | |
| -73.98668 | 14 | < 0.1% | |
| Other values (14693) | 48520 | 99.2% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| -74.24442 | 1 | < 0.1% | |
| -74.24285 | 1 | < 0.1% | |
| -74.24084 | 1 | < 0.1% | |
| -74.23986 | 1 | < 0.1% | |
| -74.23914 | 1 | < 0.1% | |
| -74.23803 | 1 | < 0.1% | |
| -74.23059 | 1 | < 0.1% | |
| -74.21238 | 1 | < 0.1% | |
| -74.21017 | 1 | < 0.1% | |
| -74.20941 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| -73.71299 | 1 | < 0.1% | |
| -73.7169 | 1 | < 0.1% | |
| -73.71795 | 1 | < 0.1% | |
| -73.71829 | 1 | < 0.1% | |
| -73.71928 | 1 | < 0.1% | |
| -73.72173 | 1 | < 0.1% | |
| -73.72179 | 1 | < 0.1% | |
| -73.72247 | 1 | < 0.1% | |
| -73.72435 | 1 | < 0.1% | |
| -73.72581 | 1 | < 0.1% |
room_type
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 382.1 KiB |
Common values
| Value | Count | Frequency (%) | |
| Entire home/apt | 25409 | 52.0% | |
| Private room | 22326 | 45.7% | |
| Shared room | 1160 | 2.4% |
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 13.53526945 |
| Min length | 11 |
Most occurring characters
| Value | Count | Frequency (%) | |
| e | 74304 | 11.2% | |
| t | 73144 | 11.1% | |
| r | 72381 | 10.9% | |
| o | 72381 | 10.9% | |
| a | 48895 | 7.4% | |
| (space) | 48895 | 7.4% | |
| m | 48895 | 7.4% | |
| i | 47735 | 7.2% | |
| h | 26569 | 4.0% | |
| E | 25409 | 3.8% | |
| n | 25409 | 3.8% | |
| / | 25409 | 3.8% | |
| p | 25409 | 3.8% | |
| P | 22326 | 3.4% | |
| v | 22326 | 3.4% | |
| S | 1160 | 0.2% | |
| d | 1160 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 538608 | 81.4% | |
| Uppercase Letter | 48895 | 7.4% | |
| Space Separator | 48895 | 7.4% | |
| Other Punctuation | 25409 | 3.8% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| E | 25409 | 52.0% | |
| P | 22326 | 45.7% | |
| S | 1160 | 2.4% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| e | 74304 | 13.8% | |
| t | 73144 | 13.6% | |
| r | 72381 | 13.4% | |
| o | 72381 | 13.4% | |
| a | 48895 | 9.1% | |
| m | 48895 | 9.1% | |
| i | 47735 | 8.9% | |
| h | 26569 | 4.9% | |
| n | 25409 | 4.7% | |
| p | 25409 | 4.7% | |
| v | 22326 | 4.1% | |
| d | 1160 | 0.2% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 48895 | 100.0% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 25409 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 587503 | 88.8% | |
| Common | 74304 | 11.2% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| e | 74304 | 12.6% | |
| t | 73144 | 12.4% | |
| r | 72381 | 12.3% | |
| o | 72381 | 12.3% | |
| a | 48895 | 8.3% | |
| m | 48895 | 8.3% | |
| i | 47735 | 8.1% | |
| h | 26569 | 4.5% | |
| E | 25409 | 4.3% | |
| n | 25409 | 4.3% | |
| p | 25409 | 4.3% | |
| P | 22326 | 3.8% | |
| v | 22326 | 3.8% | |
| S | 1160 | 0.2% | |
| d | 1160 | 0.2% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 48895 | 65.8% | |
| / | 25409 | 34.2% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 661807 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| e | 74304 | 11.2% | |
| t | 73144 | 11.1% | |
| r | 72381 | 10.9% | |
| o | 72381 | 10.9% | |
| a | 48895 | 7.4% | |
| (space) | 48895 | 7.4% | |
| m | 48895 | 7.4% | |
| i | 47735 | 7.2% | |
| h | 26569 | 4.0% | |
| E | 25409 | 3.8% | |
| n | 25409 | 3.8% | |
| / | 25409 | 3.8% | |
| p | 25409 | 3.8% | |
| P | 22326 | 3.4% | |
| v | 22326 | 3.4% | |
| S | 1160 | 0.2% | |
| d | 1160 | 0.2% |
price
Real number (ℝ≥0)
| Distinct count | 674 |
|---|---|
| Unique (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 152.7206871868289 |
|---|---|
| Minimum | 0 |
| Maximum | 10000 |
| Zeros | 11 |
| Zeros (%) | < 0.1% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 40 |
| Q1 | 69 |
| median | 106 |
| Q3 | 175 |
| 95-th percentile | 355 |
| Maximum | 10000 |
| Range | 10000 |
| Interquartile range (IQR) | 106 |
Descriptive statistics
| Standard deviation | 240.1541697 |
|---|---|
| Coefficient of variation (CV) | 1.572505822 |
| Kurtosis | 585.6728789 |
| Mean | 152.7206872 |
| Median Absolute Deviation (MAD) | 46 |
| Skewness | 19.118939 |
| Sum | 7467278 |
| Variance | 57674.02525 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 100 | 2051 | 4.2% | |
| 150 | 2047 | 4.2% | |
| 50 | 1534 | 3.1% | |
| 60 | 1458 | 3.0% | |
| 200 | 1401 | 2.9% | |
| 75 | 1370 | 2.8% | |
| 80 | 1272 | 2.6% | |
| 65 | 1190 | 2.4% | |
| 70 | 1170 | 2.4% | |
| 120 | 1130 | 2.3% | |
| 125 | 1057 | 2.2% | |
| 90 | 1021 | 2.1% | |
| 250 | 1018 | 2.1% | |
| 55 | 950 | 1.9% | |
| 45 | 891 | 1.8% | |
| 85 | 877 | 1.8% | |
| 40 | 771 | 1.6% | |
| 175 | 763 | 1.6% | |
| 99 | 742 | 1.5% | |
| 110 | 739 | 1.5% | |
| 95 | 700 | 1.4% | |
| 130 | 610 | 1.2% | |
| 300 | 561 | 1.1% | |
| 140 | 548 | 1.1% | |
| 180 | 522 | 1.1% | |
| Other values (649) | 22502 | 46.0% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 0 | 11 | < 0.1% | |
| 10 | 17 | < 0.1% | |
| 11 | 3 | < 0.1% | |
| 12 | 4 | < 0.1% | |
| 13 | 1 | < 0.1% | |
| 15 | 6 | < 0.1% | |
| 16 | 6 | < 0.1% | |
| 18 | 2 | < 0.1% | |
| 19 | 4 | < 0.1% | |
| 20 | 33 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 10000 | 3 | < 0.1% | |
| 9999 | 3 | < 0.1% | |
| 8500 | 1 | < 0.1% | |
| 8000 | 1 | < 0.1% | |
| 7703 | 1 | < 0.1% | |
| 7500 | 2 | < 0.1% | |
| 6800 | 1 | < 0.1% | |
| 6500 | 3 | < 0.1% | |
| 6419 | 1 | < 0.1% | |
| 6000 | 2 | < 0.1% |
minimum_nights
Real number (ℝ≥0)
| Distinct count | 109 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.029962163820431 |
|---|---|
| Minimum | 1 |
| Maximum | 1250 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 1250 |
| Range | 1249 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 20.51054953 |
|---|---|
| Coefficient of variation (CV) | 2.917590316 |
| Kurtosis | 854.0716624 |
| Mean | 7.029962164 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 21.82727453 |
| Sum | 343730 |
| Variance | 420.6826422 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 1 | 12720 | 26.0% | |
| 2 | 11696 | 23.9% | |
| 3 | 7999 | 16.4% | |
| 30 | 3760 | 7.7% | |
| 4 | 3303 | 6.8% | |
| 5 | 3034 | 6.2% | |
| 7 | 2058 | 4.2% | |
| 6 | 752 | 1.5% | |
| 14 | 562 | 1.1% | |
| 10 | 483 | 1.0% | |
| 29 | 340 | 0.7% | |
| 15 | 279 | 0.6% | |
| 20 | 223 | 0.5% | |
| 28 | 203 | 0.4% | |
| 31 | 201 | 0.4% | |
| 21 | 135 | 0.3% | |
| 8 | 130 | 0.3% | |
| 60 | 106 | 0.2% | |
| 90 | 104 | 0.2% | |
| 12 | 91 | 0.2% | |
| 25 | 82 | 0.2% | |
| 9 | 80 | 0.2% | |
| 13 | 54 | 0.1% | |
| 180 | 43 | 0.1% | |
| 11 | 33 | 0.1% | |
| Other values (84) | 424 | 0.9% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 1 | 12720 | 26.0% | |
| 2 | 11696 | 23.9% | |
| 3 | 7999 | 16.4% | |
| 4 | 3303 | 6.8% | |
| 5 | 3034 | 6.2% | |
| 6 | 752 | 1.5% | |
| 7 | 2058 | 4.2% | |
| 8 | 130 | 0.3% | |
| 9 | 80 | 0.2% | |
| 10 | 483 | 1.0% |
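The quartiles of minimum_nights can be recovered from the counts of its smallest values, because more than 75% of the rows fall in the values 1 to 5. The sketch below assumes the linear-interpolation quantile definition (position `q * (n - 1)` in the sorted data), which is the pandas default; the report does not state which definition it uses. The helper names `value_at` and `quantile` are hypothetical.

```python
import math

# Counts of the ten smallest minimum_nights values, copied from the table above.
counts = {1: 12720, 2: 11696, 3: 7999, 4: 3303, 5: 3034,
          6: 752, 7: 2058, 8: 130, 9: 80, 10: 483}
n = 48895  # number of observations

def value_at(rank):
    """Return the value found at a 0-based rank of the sorted column."""
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if rank < seen:
            return value
    raise ValueError("rank lies beyond the values listed in the table")

def quantile(q):
    """Linear-interpolation quantile computed from the value counts."""
    pos = q * (n - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    low_value, high_value = value_at(lo), value_at(hi)
    return low_value + (high_value - low_value) * (pos - lo)

q1, median, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
print("Q1:", q1, "median:", median, "Q3:", q3, "IQR:", q3 - q1)
```

Under that assumption the results agree with the quantile table (Q1 = 1, median = 3, Q3 = 5, IQR = 4). The 95th percentile cannot be checked this way, since it lies beyond the ten values listed here.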
Maximum 10 values
| Value | Count | Frequency (%) | |
| 1250 | 1 | < 0.1% | |
| 1000 | 1 | < 0.1% | |
| 999 | 3 | < 0.1% | |
| 500 | 5 | < 0.1% | |
| 480 | 1 | < 0.1% | |
| 400 | 1 | < 0.1% | |
| 370 | 1 | < 0.1% | |
| 366 | 1 | < 0.1% | |
| 365 | 29 | 0.1% | |
| 364 | 1 | < 0.1% |
number_of_reviews
Real number (ℝ≥0)
| Distinct count | 394 |
|---|---|
| Unique (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.274465691788528 |
|---|---|
| Minimum | 0 |
| Maximum | 629 |
| Zeros | 10052 |
| Zeros (%) | 20.6% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 5 |
| Q3 | 24 |
| 95-th percentile | 114 |
| Maximum | 629 |
| Range | 629 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 44.55058227 |
|---|---|
| Coefficient of variation (CV) | 1.91413985 |
| Kurtosis | 19.52978807 |
| Mean | 23.27446569 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 3.690634572 |
| Sum | 1138005 |
| Variance | 1984.75438 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 0 | 10052 | 20.6% | |
| 1 | 5244 | 10.7% | |
| 2 | 3465 | 7.1% | |
| 3 | 2520 | 5.2% | |
| 4 | 1994 | 4.1% | |
| 5 | 1618 | 3.3% | |
| 6 | 1357 | 2.8% | |
| 7 | 1179 | 2.4% | |
| 8 | 1127 | 2.3% | |
| 9 | 964 | 2.0% | |
| 10 | 803 | 1.6% | |
| 11 | 778 | 1.6% | |
| 12 | 682 | 1.4% | |
| 13 | 611 | 1.2% | |
| 14 | 575 | 1.2% | |
| 15 | 536 | 1.1% | |
| 16 | 471 | 1.0% | |
| 17 | 461 | 0.9% | |
| 18 | 417 | 0.9% | |
| 19 | 401 | 0.8% | |
| 20 | 391 | 0.8% | |
| 22 | 344 | 0.7% | |
| 23 | 337 | 0.7% | |
| 21 | 333 | 0.7% | |
| 25 | 313 | 0.6% | |
| Other values (369) | 11922 | 24.4% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 0 | 10052 | 20.6% | |
| 1 | 5244 | 10.7% | |
| 2 | 3465 | 7.1% | |
| 3 | 2520 | 5.2% | |
| 4 | 1994 | 4.1% | |
| 5 | 1618 | 3.3% | |
| 6 | 1357 | 2.8% | |
| 7 | 1179 | 2.4% | |
| 8 | 1127 | 2.3% | |
| 9 | 964 | 2.0% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 629 | 1 | < 0.1% | |
| 607 | 1 | < 0.1% | |
| 597 | 1 | < 0.1% | |
| 594 | 1 | < 0.1% | |
| 576 | 1 | < 0.1% | |
| 543 | 1 | < 0.1% | |
| 540 | 1 | < 0.1% | |
| 510 | 1 | < 0.1% | |
| 488 | 1 | < 0.1% | |
| 480 | 1 | < 0.1% |
last_review
Categorical
| Distinct count | 1764 |
|---|---|
| Unique (%) | 4.5% |
| Missing | 10052 |
| Missing (%) | 20.6% |
| Memory size | 382.1 KiB |
Common values
| Value | Count | Frequency (%) | |
| 2019-06-23 | 1413 | 2.9% | |
| 2019-07-01 | 1359 | 2.8% | |
| 2019-06-30 | 1341 | 2.7% | |
| 2019-06-24 | 875 | 1.8% | |
| 2019-07-07 | 718 | 1.5% | |
| 2019-07-02 | 658 | 1.3% | |
| 2019-06-22 | 655 | 1.3% | |
| 2019-06-16 | 601 | 1.2% | |
| 2019-07-05 | 580 | 1.2% | |
| 2019-07-06 | 565 | 1.2% | |
| 2019-06-21 | 529 | 1.1% | |
| 2019-06-29 | 529 | 1.1% | |
| 2019-07-03 | 426 | 0.9% | |
| 2019-06-19 | 426 | 0.9% | |
| 2019-06-25 | 423 | 0.9% | |
| 2019-06-20 | 421 | 0.9% | |
| 2019-06-28 | 399 | 0.8% | |
| 2019-01-01 | 398 | 0.8% | |
| 2019-06-26 | 390 | 0.8% | |
| 2019-06-09 | 383 | 0.8% | |
| 2019-06-17 | 376 | 0.8% | |
| 2019-06-15 | 370 | 0.8% | |
| 2019-05-27 | 347 | 0.7% | |
| 2019-06-18 | 340 | 0.7% | |
| 2019-06-27 | 334 | 0.7% | |
| Other values (1739) | 23987 | 49.1% | |
| (Missing) | 10052 | 20.6% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 8.560916249 |
| Min length | 3 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 0 | 92333 | 22.1% | |
| - | 77686 | 18.6% | |
| 1 | 62027 | 14.8% | |
| 2 | 58684 | 14.0% | |
| 9 | 30106 | 7.2% | |
| n | 20104 | 4.8% | |
| 6 | 19890 | 4.8% | |
| 7 | 12824 | 3.1% | |
| 8 | 10838 | 2.6% | |
| a | 10052 | 2.4% | |
| 5 | 9577 | 2.3% | |
| 3 | 8764 | 2.1% | |
| 4 | 5701 | 1.4% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 310744 | 74.2% | |
| Dash Punctuation | 77686 | 18.6% | |
| Lowercase Letter | 30156 | 7.2% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 0 | 92333 | 29.7% | |
| 1 | 62027 | 20.0% | |
| 2 | 58684 | 18.9% | |
| 9 | 30106 | 9.7% | |
| 6 | 19890 | 6.4% | |
| 7 | 12824 | 4.1% | |
| 8 | 10838 | 3.5% | |
| 5 | 9577 | 3.1% | |
| 3 | 8764 | 2.8% | |
| 4 | 5701 | 1.8% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 77686 | 100.0% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 20104 | 66.7% | |
| a | 10052 | 33.3% |
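The letters `n` and `a` in a date column, together with a minimum length of 3, suggest that the 10052 missing values were counted as the text `nan` in the length and character statistics. The report does not state this, so the sketch below treats it as an assumption and checks it against the reported totals, taking every non-missing value to be a 10-character `YYYY-MM-DD` string. The variable names are assumptions made for this example.

```python
# Figures copied from the last_review tables above.
n_rows = 48895
n_missing = 10052
n_dates = n_rows - n_missing

# Assumption: missing values were converted to text before counting characters.
missing_text = str(float("nan"))  # Python renders NaN as "nan"
sample_date = "2019-06-23"        # every non-missing value has this shape

total_chars = n_dates * len(sample_date) + n_missing * len(missing_text)
mean_length = total_chars / n_rows
dash_count = n_dates * sample_date.count("-")
n_count = n_missing * missing_text.count("n")
a_count = n_missing * missing_text.count("a")

print("total characters:", total_chars)
print("mean length:", mean_length)
print("dashes:", dash_count, "n:", n_count, "a:", a_count)
```

Under this assumption the total character count, mean length, dash count and both letter counts match the tables above, so the Lowercase Letter and Latin rows describe missing values rather than real date content.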
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 388430 | 92.8% | |
| Latin | 30156 | 7.2% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 0 | 92333 | 23.8% | |
| - | 77686 | 20.0% | |
| 1 | 62027 | 16.0% | |
| 2 | 58684 | 15.1% | |
| 9 | 30106 | 7.8% | |
| 6 | 19890 | 5.1% | |
| 7 | 12824 | 3.3% | |
| 8 | 10838 | 2.8% | |
| 5 | 9577 | 2.5% | |
| 3 | 8764 | 2.3% | |
| 4 | 5701 | 1.5% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| n | 20104 | 66.7% | |
| a | 10052 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 418586 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 0 | 92333 | 22.1% | |
| - | 77686 | 18.6% | |
| 1 | 62027 | 14.8% | |
| 2 | 58684 | 14.0% | |
| 9 | 30106 | 7.2% | |
| n | 20104 | 4.8% | |
| 6 | 19890 | 4.8% | |
| 7 | 12824 | 3.1% | |
| 8 | 10838 | 2.6% | |
| a | 10052 | 2.4% | |
| 5 | 9577 | 2.3% | |
| 3 | 8764 | 2.1% | |
| 4 | 5701 | 1.4% |
reviews_per_month
Real number (ℝ≥0)
| Distinct count | 937 |
|---|---|
| Unique (%) | 2.4% |
| Missing | 10052 |
| Missing (%) | 20.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3732214298586618 |
| Minimum | 0.01 |
| Maximum | 58.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 0.04 |
| Q1 | 0.19 |
| median | 0.72 |
| Q3 | 2.02 |
| 95-th percentile | 4.64 |
| Maximum | 58.5 |
| Range | 58.49 |
| Interquartile range (IQR) | 1.83 |
Descriptive statistics
| Standard deviation | 1.680441995 |
|---|---|
| Coefficient of variation (CV) | 1.223722525 |
| Kurtosis | 42.49346948 |
| Mean | 1.37322143 |
| Median Absolute Deviation (MAD) | 0.62 |
| Skewness | 3.130188536 |
| Sum | 53340.04 |
| Variance | 2.823885299 |
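The derived figures in these tables can be cross-checked from the reported values. The following is a minimal sketch that uses only numbers copied from this report; the variable names are illustrative and are not part of pandas-profiling.

```python
import math

# Values copied from the reviews_per_month tables in this report.
n_rows = 48895            # number of observations
n_missing = 10052         # missing cells in this column
total = 53340.04          # reported Sum
mean = 1.37322143
std = 1.680441995
q1, q3 = 0.19, 2.02
minimum, maximum = 0.01, 58.5

missing_pct = 100 * n_missing / n_rows      # share of missing values
mean_from_sum = total / (n_rows - n_missing)  # mean over non-missing rows
cv = std / mean                             # coefficient of variation
variance = std ** 2
iqr = q3 - q1                               # interquartile range
value_range = maximum - minimum

print(f"missing: {missing_pct:.1f}%")
print(f"mean from sum: {mean_from_sum:.6f}")
print(f"CV: {cv:.6f}, variance: {variance:.6f}")
print(f"IQR: {iqr:.2f}, range: {value_range:.2f}")
```

The sketch assumes that the mean is taken over the non-missing rows only, which is consistent with the reported Sum.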
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 0.02 | 919 | 1.9% | |
| 0.05 | 893 | 1.8% | |
| 1 | 893 | 1.8% | |
| 0.03 | 804 | 1.6% | |
| 0.16 | 667 | 1.4% | |
| 0.04 | 655 | 1.3% | |
| 0.08 | 596 | 1.2% | |
| 0.09 | 593 | 1.2% | |
| 0.06 | 579 | 1.2% | |
| 0.11 | 539 | 1.1% | |
| 0.07 | 466 | 1.0% | |
| 0.13 | 463 | 0.9% | |
| 0.1 | 457 | 0.9% | |
| 0.12 | 413 | 0.8% | |
| 2 | 406 | 0.8% | |
| 0.14 | 399 | 0.8% | |
| 0.15 | 374 | 0.8% | |
| 0.19 | 357 | 0.7% | |
| 0.21 | 343 | 0.7% | |
| 0.17 | 321 | 0.7% | |
| 0.22 | 318 | 0.7% | |
| 0.18 | 305 | 0.6% | |
| 0.26 | 305 | 0.6% | |
| 0.25 | 290 | 0.6% | |
| 0.23 | 289 | 0.6% | |
| Other values (912) | 26199 | 53.6% | |
| (Missing) | 10052 | 20.6% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 0.01 | 42 | 0.1% | |
| 0.02 | 919 | 1.9% | |
| 0.03 | 804 | 1.6% | |
| 0.04 | 655 | 1.3% | |
| 0.05 | 893 | 1.8% | |
| 0.06 | 579 | 1.2% | |
| 0.07 | 466 | 1.0% | |
| 0.08 | 596 | 1.2% | |
| 0.09 | 593 | 1.2% | |
| 0.1 | 457 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 58.5 | 1 | < 0.1% | |
| 27.95 | 1 | < 0.1% | |
| 20.94 | 1 | < 0.1% | |
| 19.75 | 1 | < 0.1% | |
| 17.82 | 1 | < 0.1% | |
| 16.81 | 1 | < 0.1% | |
| 16.22 | 1 | < 0.1% | |
| 16.03 | 1 | < 0.1% | |
| 15.78 | 1 | < 0.1% | |
| 15.32 | 1 | < 0.1% |
calculated_host_listings_count
Real number (ℝ≥0)
| Distinct count | 47 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.143982002249719 |
| Minimum | 1 |
| Maximum | 327 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 15 |
| Maximum | 327 |
| Range | 326 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 32.95251885 |
|---|---|
| Coefficient of variation (CV) | 4.612626241 |
| Kurtosis | 67.5508883 |
| Mean | 7.143982002 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.9331739 |
| Sum | 349305 |
| Variance | 1085.868499 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 1 | 32303 | 66.1% | |
| 2 | 6658 | 13.6% | |
| 3 | 2853 | 5.8% | |
| 4 | 1440 | 2.9% | |
| 5 | 845 | 1.7% | |
| 6 | 570 | 1.2% | |
| 8 | 416 | 0.9% | |
| 7 | 399 | 0.8% | |
| 327 | 327 | 0.7% | |
| 9 | 234 | 0.5% | |
| 232 | 232 | 0.5% | |
| 10 | 210 | 0.4% | |
| 96 | 192 | 0.4% | |
| 12 | 180 | 0.4% | |
| 13 | 130 | 0.3% | |
| 121 | 121 | 0.2% | |
| 11 | 110 | 0.2% | |
| 52 | 104 | 0.2% | |
| 103 | 103 | 0.2% | |
| 33 | 99 | 0.2% | |
| 49 | 98 | 0.2% | |
| 91 | 91 | 0.2% | |
| 87 | 87 | 0.2% | |
| 15 | 75 | 0.2% | |
| 14 | 70 | 0.1% | |
| Other values (22) | 948 | 1.9% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 1 | 32303 | 66.1% | |
| 2 | 6658 | 13.6% | |
| 3 | 2853 | 5.8% | |
| 4 | 1440 | 2.9% | |
| 5 | 845 | 1.7% | |
| 6 | 570 | 1.2% | |
| 7 | 399 | 0.8% | |
| 8 | 416 | 0.9% | |
| 9 | 234 | 0.5% | |
| 10 | 210 | 0.4% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 327 | 327 | 0.7% | |
| 232 | 232 | 0.5% | |
| 121 | 121 | 0.2% | |
| 103 | 103 | 0.2% | |
| 96 | 192 | 0.4% | |
| 91 | 91 | 0.2% | |
| 87 | 87 | 0.2% | |
| 65 | 65 | 0.1% | |
| 52 | 104 | 0.2% | |
| 50 | 50 | 0.1% |
availability_365
Real number (ℝ≥0)
| Distinct count | 366 |
|---|---|
| Unique (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 112.78132733408324 |
| Minimum | 0 |
| Maximum | 365 |
| Zeros | 17533 |
| Zeros (%) | 35.9% |
| Memory size | 382.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 45 |
| Q3 | 227 |
| 95-th percentile | 359 |
| Maximum | 365 |
| Range | 365 |
| Interquartile range (IQR) | 227 |
Descriptive statistics
| Standard deviation | 131.6222889 |
|---|---|
| Coefficient of variation (CV) | 1.167057455 |
| Kurtosis | -0.9975340452 |
| Mean | 112.7813273 |
| Median Absolute Deviation (MAD) | 45 |
| Skewness | 0.7634075771 |
| Sum | 5514443 |
| Variance | 17324.42692 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
| 0 | 17533 | 35.9% | |
| 365 | 1295 | 2.6% | |
| 364 | 491 | 1.0% | |
| 1 | 408 | 0.8% | |
| 89 | 361 | 0.7% | |
| 5 | 340 | 0.7% | |
| 3 | 306 | 0.6% | |
| 179 | 301 | 0.6% | |
| 90 | 290 | 0.6% | |
| 2 | 270 | 0.6% | |
| 6 | 245 | 0.5% | |
| 363 | 239 | 0.5% | |
| 4 | 233 | 0.5% | |
| 8 | 233 | 0.5% | |
| 342 | 230 | 0.5% | |
| 188 | 225 | 0.5% | |
| 7 | 219 | 0.4% | |
| 88 | 200 | 0.4% | |
| 341 | 199 | 0.4% | |
| 311 | 199 | 0.4% | |
| 9 | 193 | 0.4% | |
| 180 | 192 | 0.4% | |
| 83 | 183 | 0.4% | |
| 358 | 180 | 0.4% | |
| 14 | 173 | 0.4% | |
| Other values (341) | 24157 | 49.4% |
Minimum 10 values
| Value | Count | Frequency (%) | |
| 0 | 17533 | 35.9% | |
| 1 | 408 | 0.8% | |
| 2 | 270 | 0.6% | |
| 3 | 306 | 0.6% | |
| 4 | 233 | 0.5% | |
| 5 | 340 | 0.7% | |
| 6 | 245 | 0.5% | |
| 7 | 219 | 0.4% | |
| 8 | 233 | 0.5% | |
| 9 | 193 | 0.4% |
Maximum 10 values
| Value | Count | Frequency (%) | |
| 365 | 1295 | 2.6% | |
| 364 | 491 | 1.0% | |
| 363 | 239 | 0.5% | |
| 362 | 166 | 0.3% | |
| 361 | 111 | 0.2% | |
| 360 | 102 | 0.2% | |
| 359 | 135 | 0.3% | |
| 358 | 180 | 0.4% | |
| 357 | 95 | 0.2% | |
| 356 | 78 | 0.2% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line does not affect r; only its direction does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
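The calculation described above can be sketched in plain Python. This is an illustrative example on made-up toy values, not the code pandas-profiling runs; the function name `pearson_r` and all sample data are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the two standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2 * v + 1 for v in x]         # exact increasing line
y_down = [-0.5 * v + 10 for v in x]   # exact decreasing line
y_curve = [v ** 2 for v in x]         # monotonic but not linear

# Shifting and rescaling x separately leaves r unchanged.
x_shifted = [10 * v + 3 for v in x]

print(pearson_r(x, y_up), pearson_r(x, y_down))
print(pearson_r(x, y_curve), pearson_r(x_shifted, y_curve))
```

The two exact lines give r of +1 and -1 regardless of their slopes, and the last line prints the same value twice, which illustrates the invariance under location and scale changes.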
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
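As a rough illustration of the rank-based calculation, the sketch below converts each variable to ranks and then applies the Pearson formula to the ranks. Tied values receive the mean of their positions, which is one common convention and an assumption here; the function names and toy data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical toy data: a strictly increasing but nonlinear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy data the ranks of x and y coincide, so ρ reaches +1 while Pearson's r stays below 1.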
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
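The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements it directly on hypothetical toy data without ties; statistical libraries often default to a tie-adjusted variant (tau-b), so results on tied data may differ from this illustration. The function name is an assumption.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1    # both variables order the pair the same way
        elif sign < 0:
            discordant += 1    # the variables order the pair in opposite ways
        # sign == 0 means a tie; tau-a counts it as neither.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical toy data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

print(kendall_tau_a(x, y))
```

With four observations there are six pairs; five are concordant and one is discordant, so τ is (5 - 1) / 6.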
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2539 | Clean & quiet apt home by the park | 2787 | John | Brooklyn | Kensington | 40.64749 | -73.97237 | Private room | 149 | 1 | 9 | 2018-10-19 | 0.21 | 6 | 365 |
| 1 | 2595 | Skylit Midtown Castle | 2845 | Jennifer | Manhattan | Midtown | 40.75362 | -73.98377 | Entire home/apt | 225 | 1 | 45 | 2019-05-21 | 0.38 | 2 | 355 |
| 2 | 3647 | THE VILLAGE OF HARLEM....NEW YORK ! | 4632 | Elisabeth | Manhattan | Harlem | 40.80902 | -73.94190 | Private room | 150 | 3 | 0 | NaN | NaN | 1 | 365 |
| 3 | 3831 | Cozy Entire Floor of Brownstone | 4869 | LisaRoxanne | Brooklyn | Clinton Hill | 40.68514 | -73.95976 | Entire home/apt | 89 | 1 | 270 | 2019-07-05 | 4.64 | 1 | 194 |
| 4 | 5022 | Entire Apt: Spacious Studio/Loft by central park | 7192 | Laura | Manhattan | East Harlem | 40.79851 | -73.94399 | Entire home/apt | 80 | 10 | 9 | 2018-11-19 | 0.10 | 1 | 0 |
| 5 | 5099 | Large Cozy 1 BR Apartment In Midtown East | 7322 | Chris | Manhattan | Murray Hill | 40.74767 | -73.97500 | Entire home/apt | 200 | 3 | 74 | 2019-06-22 | 0.59 | 1 | 129 |
| 6 | 5121 | BlissArtsSpace! | 7356 | Garon | Brooklyn | Bedford-Stuyvesant | 40.68688 | -73.95596 | Private room | 60 | 45 | 49 | 2017-10-05 | 0.40 | 1 | 0 |
| 7 | 5178 | Large Furnished Room Near B'way | 8967 | Shunichi | Manhattan | Hell's Kitchen | 40.76489 | -73.98493 | Private room | 79 | 2 | 430 | 2019-06-24 | 3.47 | 1 | 220 |
| 8 | 5203 | Cozy Clean Guest Room - Family Apt | 7490 | MaryEllen | Manhattan | Upper West Side | 40.80178 | -73.96723 | Private room | 79 | 2 | 118 | 2017-07-21 | 0.99 | 1 | 0 |
| 9 | 5238 | Cute & Cozy Lower East Side 1 bdrm | 7549 | Ben | Manhattan | Chinatown | 40.71344 | -73.99037 | Entire home/apt | 150 | 1 | 160 | 2019-06-09 | 1.33 | 4 | 188 |
Last rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48885 | 36482809 | Stunning Bedroom NYC! Walking to Central Park!! | 131529729 | Kendall | Manhattan | East Harlem | 40.79633 | -73.93605 | Private room | 75 | 2 | 0 | NaN | NaN | 2 | 353 |
| 48886 | 36483010 | Comfy 1 Bedroom in Midtown East | 274311461 | Scott | Manhattan | Midtown | 40.75561 | -73.96723 | Entire home/apt | 200 | 6 | 0 | NaN | NaN | 1 | 176 |
| 48887 | 36483152 | Garden Jewel Apartment in Williamsburg New York | 208514239 | Melki | Brooklyn | Williamsburg | 40.71232 | -73.94220 | Entire home/apt | 170 | 1 | 0 | NaN | NaN | 3 | 365 |
| 48888 | 36484087 | Spacious Room w/ Private Rooftop, Central location | 274321313 | Kat | Manhattan | Hell's Kitchen | 40.76392 | -73.99183 | Private room | 125 | 4 | 0 | NaN | NaN | 1 | 31 |
| 48889 | 36484363 | QUIT PRIVATE HOUSE | 107716952 | Michael | Queens | Jamaica | 40.69137 | -73.80844 | Private room | 65 | 1 | 0 | NaN | NaN | 2 | 163 |
| 48890 | 36484665 | Charming one bedroom - newly renovated rowhouse | 8232441 | Sabrina | Brooklyn | Bedford-Stuyvesant | 40.67853 | -73.94995 | Private room | 70 | 2 | 0 | NaN | NaN | 2 | 9 |
| 48891 | 36485057 | Affordable room in Bushwick/East Williamsburg | 6570630 | Marisol | Brooklyn | Bushwick | 40.70184 | -73.93317 | Private room | 40 | 4 | 0 | NaN | NaN | 2 | 36 |
| 48892 | 36485431 | Sunny Studio at Historical Neighborhood | 23492952 | Ilgar & Aysel | Manhattan | Harlem | 40.81475 | -73.94867 | Entire home/apt | 115 | 10 | 0 | NaN | NaN | 1 | 27 |
| 48893 | 36485609 | 43rd St. Time Square-cozy single bed | 30985759 | Taz | Manhattan | Hell's Kitchen | 40.75751 | -73.99112 | Shared room | 55 | 1 | 0 | NaN | NaN | 6 | 2 |
| 48894 | 36487245 | Trendy duplex in the very heart of Hell's Kitchen | 68119814 | Christophe | Manhattan | Hell's Kitchen | 40.76404 | -73.98933 | Private room | 90 | 7 | 0 | NaN | NaN | 1 | 23 |